Experience-Based Language Processing
Abstract
The multiple-cue integration perspective on language acquisition highlights the rich nature of the input. In combination with the emphasis on the cultural evolution of language, this points to an experience-based account of language processing, in which exposure to language plays a crucial role in determining language ability. The sixth chapter therefore emphasizes the importance of experience for understanding language processing, focusing on the processing of relative clauses as an example. Evidence from corpus analyses, computational modeling, and psycholinguistic experimentation demonstrates that variation in relative clause processing—including differences across individuals—can be explained by variations in linguistic experience. Additional experimental data suggest that individual differences in domain-general abilities for sequence learning and memory-based chunking, in turn, may affect individuals’ ability to learn from linguistic experience. It is concluded that our language abilities emerge through complex interactions between linguistic experience and multiple constraints deriving from learning and processing.
Our native language is like a second skin, so much a part of us we resist the idea that it is constantly changing, constantly being renewed.
Although we may not give it much thought, we are generally aware that the language we speak changes gradually over time. We regularly come across new words that quickly enter into our vocabulary. For example, the word selfie has rapidly become part of everyday English, referring to a photo of oneself (sometimes with other people and/or at notable locations or events), typically taken with a smartphone and uploaded to social networking services, such as Facebook, Snapchat, or Twitter. In 2013, selfie was selected as Word of the Year for both US and UK English by the Oxford English Dictionary.1 The word (and the concept to which it refers) has been widely adopted across the world, with famous selfies including one taken by the Danish Prime Minister Helle Thorning-Schmidt with US President Barack Obama and British Prime Minister David Cameron at Nelson Mandela’s memorial service in December 2013, and comedian Ellen DeGeneres’ selfie with nine Hollywood celebrities at the 2014 Oscar ceremony (which became the most retweeted photo ever with more than 3.3 million retweets2). The word selfie has even spawned activity-related variations such as helfie (a selfie of one’s hair), welfie (a selfie during a workout), and drelfie (a selfie taken while drunk). Of course, as new words like selfie enter into our vocabulary, others gradually disappear (e.g., the word fax discussed in chapter 3), resulting in an ever-changing vocabulary.
Whereas the shifting nature of our vocabulary is readily apparent, our intuition suggests that other aspects of language, including how we pronounce words (phonology) and put them together to form sentences (syntax), are much more stable. However, as we already noted in chapter 3, our phonology is also subject to change over our lifetime, as evidenced by the drift in the vowels produced by the Queen of England in her yearly Christmas messages recorded between the 1950s and 1980s (Harrington et al., 2000). It is even possible to lose one’s native language accent in circumstances where a second language becomes the primary means of communication (de Leeuw, Schmid, & Mennen, 2010). Such “first language attrition” can occur when adult immigrants live in linguistic environments with limited opportunity for utilizing their first language, resulting in a foreign accent when speaking their native language. But what about syntax? Does our grammatical knowledge change across the lifespan?
The standard generative perspective tends to see grammatical knowledge as being largely fixed after language acquisition is completed in childhood. Pinker (1994, p. 294) expressed this perspective succinctly: “… learning a language—as opposed to using a language—is perfectly useful as a one-shot skill. Once the details of the local language have been acquired from the surrounding adults, any further ability to learn (aside from vocabulary) is superfluous.” Note, too, that Pinker highlights the separation of acquisition from processing (learning vs. using a language) that is characteristic of generative approaches to language from current versions of the Principles and Parameters Theory (e.g., Crain, Goro & Thornton, 2006; Crain & Pietroski, 2006), to the Simpler Syntax framework (Culicover & Jackendoff, 2005; Jackendoff, 2007; Pinker & Jackendoff, 2005) and the Minimalist Program (Boeckx, 2006; Chomsky, 1995), as discussed in chapter 1.
In contrast, we view acquisition and processing as fundamentally intertwined. As we argued in chapter 4, acquisition involves learning to carry out incremental Chunk-and-Pass processing to overcome the effects of the Now-or-Never bottleneck. This perspective assigns a fundamental role to linguistic experience in developing the appropriate processing skills to deal with the continual onslaught of the input, given various cognitive and communicative constraints. A key prediction from this account is that, just like the acquisition of any other skill, language learning never stops; we should continuously be influenced by new language input, and constantly be upgrading our language processing abilities, including our knowledge of syntactic regularities. Of course, this does not mean that the language of adults and children changes at the same rate. Consider how adding a teaspoon of red coloring to a glass of water will change the overall color considerably, whereas adding the same amount of color to a bathtub full of water has little effect. Likewise, linguistic input that has a substantial effect on a child’s emergent processing skills may have relatively little impact on an adult language system, shaped by many years of accumulated linguistic experience. Still, with sufficient exposure adult language can change too, just as the bathtub water will turn red if we add enough coloring.
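The dilution analogy can be made concrete with a small numerical sketch. It assumes, purely for illustration, that a comprehender's expectation for some pattern is a pooled proportion over all exposures so far; this simplification and the function name `pooled_rate` are hypothetical and are not a model proposed in the chapter.

```python
def pooled_rate(prior_count, prior_rate, new_count, new_rate):
    """Pool earlier and new exposures into a single overall proportion.

    Illustrative assumption only: experience is summarized as a simple
    proportion over all exposures, with every exposure weighted equally.
    """
    total = prior_count + new_count
    return (prior_count * prior_rate + new_count * new_rate) / total


# The same batch of new input (100 exposures, all showing the pattern) ...
child = pooled_rate(prior_count=100, prior_rate=0.5, new_count=100, new_rate=1.0)
# ... moves a small store of experience far more than a large one.
adult = pooled_rate(prior_count=10_000, prior_rate=0.5, new_count=100, new_rate=1.0)

print(f"child-sized store: {child:.3f}")
print(f"adult-sized store: {adult:.3f}")
```

Under this assumption the same input shifts the smaller store much more than the larger one, yet the larger store still moves, and would move further given enough new exposures.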
If linguistic experience plays an important role in shaping our native language ability, we would expect substantial individual differences in people’s language skills as a function of variation in input. It is well established that differences in linguistic input can result in dramatic differences in vocabulary acquisition (e.g., Hoff, 2006). For example, a classic study by Hart and Risley (1995) found that middle-class children heard more than three times the number of words per hour compared to children in welfare-recipient families. At just three years of age, children from middle-class families had a cumulative vocabulary that was more than twice as large as the children from families receiving welfare. These childhood differences in vocabulary have been found to predict later language outcomes in large-scale studies (e.g., Burchinal et al., 2011; Farkas & Beron, 2004).
Effects of social background have also been found for basic language processing skills in infancy (Fernald, Marchman, & Weisleder, 2013). Longitudinal research indicates that these differences in language processing skill are associated with variations in the amount and richness of maternal speech (Hurtado, Marchman, & Fernald, 2008). Variations in linguistic input further predict the complexity of children’s later syntactic development, both in terms of comprehension and production (Huttenlocher, Vasilyeva, Cymerman, & Levine, 2002). These differences continue into adulthood, as evidenced, for instance, by the considerable differences in syntactic abilities observed as a function of education (Dąbrowska, 1997). Importantly, such differences do not stem merely from failures to process obscure sentences with strange grammatical constructions. For example, Street and Dąbrowska (2010) found that adult native speakers of English with less than eleven years of education had problems comprehending passive sentences describing simple transitive events such as The girl was photographed by the boy. Thus, there are substantial individual differences in grammatical ability across adult native speakers (see Dąbrowska, 2012; Farmer, Misyak, & Christiansen, 2012, for reviews).
We see individual differences in language comprehension and production as a reflection of the importance of linguistic experience in shaping the Chunk-and-Pass processing skills necessary to deal with the fast pace of language input. In this chapter, we focus on the processing of relative clauses as a window into the role of experience in shaping language-processing skills more broadly. We consider evidence concerning how the differential processing difficulty of specific types of relative clause constructions reflects their distributional properties, when considered in combination with pragmatic constraints. Variation in exposure to relative clauses is further argued to be a determining factor in producing differences in the processing of these constructions across individuals. Of course, the role of linguistic experience is also shaped by the various cognitive and communicative constraints discussed in chapters 2 and 4. In this chapter, we focus on the cognitive constraints governing our ability to deal with sequentially presented material, whether linguistic or not. These sequence-processing mechanisms allow us to continuously update and maintain our knowledge of language, promoting not only long-term linguistic stability but also allowing for modification of our grammatical abilities, and hence our knowledge of the grammar, in response to changes in the input.
6.1 Processing Biases as a Reflection of Linguistic Input
Practice is generally considered to be key to successful skill learning. Similarly, to elaborate on Periander’s famous saying,3 practice makes perfect when it comes to language acquisition and processing. Through repeated experience with Chunk-and-Pass processing of various types of linguistic structure, children acquire their native language and adults hone their processing abilities. Thus, our general language processing biases relating to different types of syntactic constructions should, to a large extent, reflect the distribution of those very constructions in our linguistic input. This perspective is consistent with a number of usage-based approaches to language acquisition (e.g., Tomasello, 2003), processing (e.g., Arnon & Snider, 2010), and change (Bybee, 2006). A number of psycholinguistic studies have demonstrated the impact of distributional biases on the incremental processing of sentences involving syntactic ambiguities (e.g., Crocker & Corley, 2002; Desmet, De Baecke, Drieghe, Brysbaert, & Vonk, 2006; Jurafsky, 1996; MacDonald, Pearlmutter, & Seidenberg, 1994; Mitchell, Cuetos, Corley, & Brysbaert, 1995). Here, however, we focus on the influence of statistical information on the processing of unambiguous sentences involving embedded relative clauses.
A relative clause is a kind of subordinate clause that is typically used in English to modify a previously encountered noun or noun phrase. The previous referent can either be the subject or the object of the relative clause, as shown in (1) and (2), respectively (with the relative clauses underlined).
(1) The reporter that attacked the senator admitted the error.
(2) The reporter that the senator attacked admitted the error.
In both sentences, the reporter is the subject of the main clause (the reporter… admitted the error). The two sentences differ in the role that the reporter plays in the relative clause. In subject relative clauses as in (1), the reporter is also the subject of the relative clause (the reporter attacked the senator). This contrasts with the object relative clause in (2), where the reporter is the object of the relative clause (corresponding to the senator attacked the reporter).
Psycholinguistic experiments have shown that subject relative sentences such as (1) are easier to process than their object relative counterparts such as (2) (e.g., Ford, 1983; Holmes & O’Regan, 1981; King & Just, 1991; for a review, see Gibson, 1998). Theories differ in how they explain the observed difference in processing difficulty between subject and object relative sentences. Structure-based accounts inspired by generative grammar suggest that there is a universal preference for syntactic gaps (from movement) in the subject position (Miyamoto & Nakamura, 2003). This account predicts that subject relatives should always be easier to process than object relatives, irrespective of functional or discourse considerations. In contrast, functional accounts ascribe the differential processing difficulty to cognitive or communicative constraints. For example, working-memory-based approaches suggest that object relatives are harder to process because they require the language system temporarily to store more incomplete dependencies (because the reporter is the object of the subordinate verb attacked) than subject relatives (e.g., Gibson, 1998; Lewis, 1996). Some memory-based theories further highlight possible interference between constituents held in working memory (e.g., Bever, 1970; Gordon, Hendrick, & Johnson, 2001; Lewis & Vasishth, 2005): in object relative clauses, the two noun phrases (the reporter and the senator) may interfere with one another prior to the processing of the subordinate verb (attacked). Accounts focusing on communicative constraints suggest that object relative sentences, when presented in isolation, violate discourse-based expectations for object relatives to provide information about a previously mentioned discourse-old referent (e.g., Fox & Thompson, 1990; Mak, Vonk, & Schriefers, 2008; Roland, Mauner, O’Meara, & Yun, 2012). 
In contrast, subject relative sentences are less problematic when presented in isolation, as they tend to involve discourse-new referents. Finally, experience-based approaches suggest that a key factor in explaining the processing of subject versus object relative clauses is their relative distribution in the linguistic experience of individual language comprehenders (e.g., MacDonald & Christiansen, 2002; Reali & Christiansen, 2007a,b).
The Chunk-and-Pass processing perspective provides a possible experience-based framework within which to bring the cognitive and communicative factors together. As noted in chapter 4, experience is key to facilitating chunking at various levels of linguistic representation. In particular, repeated exposure to specific syntactic constructions, such as relative clauses, will make them easier to process, and allow for more chunks to be kept in memory (see also Jones, 2012). Consistent with this account, Roth (1984) found that extra experience with processing relative clauses improved three- to four-year-old children’s comprehension of these constructions in comparison with a control group that received an equal amount of exposure to sentences involving different syntactic structures. Importantly, measures of the children’s working memory indicated that the improvements in relative clause processing were not associated with increases in working memory capacity (as might be expected from pure working memory accounts; e.g., Just & Carpenter, 1992). Instead, we suggest that more efficient Chunk-and-Pass processing abilities explain such experience-based facilitation of relative clause comprehension. If so, then our ability to process different types of relative clause constructions should largely reflect their distributional properties in natural language (see box 6.1 on the widespread effects of frequency throughout cognition).
6.1.1 Pronominal Relative Clauses
Pronominal relative clauses provide a straightforward way to test this prediction. Previous work has shown that the processing difficulty associated with object relatives such as (2) is much diminished when the embedded clause involves a personal pronoun (such as you) in (3) rather than a full noun phrase (such as the senator) (e.g., Gordon et al., 2001; Warren & Gibson, 2002).
(3) The reporter that you attacked admitted the error.
To determine whether the comparative ease of processing pronominal object relative clauses might reflect the distributional properties of English, Reali and Christiansen (2007a) conducted a large-scale corpus analysis, using the American National Corpus (ANC) (Ide & Suderman, 2004), which contains over eleven million words from both spoken and written sources. The corpus contains morpho-syntactic tags that allowed subject and object relative clause sentences to be extracted. When considering relative clause sentences with embedded full noun phrases (as in 1 and 2), subject relatives were roughly twice as frequent as object relatives (68.3% vs. 31.7%). In contrast, when considering pronominal relative clause sentences, the picture was reversed: object relatives occurred almost twice as often as subject relatives (65.5% vs. 34.5%).
Reali and Christiansen (2007a) then looked more closely at the pronominal relative clause results, conducting separate analyses for five different types of pronouns: first-person pronouns (I, we, me, us), second-person pronoun (you), third-person personal pronouns (she, he, they, her, him, them), third-person impersonal pronoun (it), and nominal pronouns (e.g., someone, something). These additional analyses revealed an intriguing pattern in which the pronominal relative clauses with a personal pronoun showed a distributional bias toward object relatives (first-person pronouns: 82% object relatives; second-person pronouns: 74% object relatives; third-person pronouns: 68% object relatives). However, this tendency was reversed when the relative clauses involved impersonal (34% object relatives) or nominal (22% object relatives) pronouns.
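The logic of such a corpus tally can be sketched in a few lines. The sketch below is a simplification under stated assumptions, not the procedure used in the study: the tag labels (`REL`, `VERB`, `PRON`, `DET`, `NOUN`), the function names, and the five toy clauses are all hypothetical, and each clause is assumed to be pre-extracted and to start at the relativizer.

```python
from collections import Counter

def classify_relative(tagged_clause):
    """Label a clause 'SR' (subject relative) or 'OR' (object relative).

    tagged_clause is a list of (word, tag) pairs starting at the relativizer.
    A verb right after the relativizer is treated as a subject relative;
    a pronoun or noun phrase right after it is treated as an object relative.
    """
    if len(tagged_clause) < 2 or tagged_clause[0][1] != "REL":
        return None
    next_tag = tagged_clause[1][1]
    if next_tag == "VERB":
        return "SR"
    if next_tag in {"PRON", "DET", "NOUN"}:
        return "OR"
    return None

def embedded_np_type(tagged_clause):
    """Return 'pronoun' or 'full_np' for the first embedded noun phrase."""
    for _, tag in tagged_clause[1:]:
        if tag == "PRON":
            return "pronoun"
        if tag in {"DET", "NOUN"}:
            return "full_np"
    return None

def proportions(clauses):
    """Proportion of SR vs. OR clauses within each embedded-NP type."""
    counts = Counter()
    for clause in clauses:
        kind, np_type = classify_relative(clause), embedded_np_type(clause)
        if kind and np_type:
            counts[(np_type, kind)] += 1
    result = {}
    for np_type in {key[0] for key in counts}:
        total = counts[(np_type, "SR")] + counts[(np_type, "OR")]
        result[np_type] = {k: counts[(np_type, k)] / total for k in ("SR", "OR")}
    return result

# Toy clauses built from the chapter's own example sentences (not corpus data).
TOY_CLAUSES = [
    [("that", "REL"), ("attacked", "VERB"), ("the", "DET"), ("senator", "NOUN")],
    [("that", "REL"), ("the", "DET"), ("senator", "NOUN"), ("attacked", "VERB")],
    [("that", "REL"), ("you", "PRON"), ("attacked", "VERB")],
    [("that", "REL"), ("visited", "VERB"), ("me", "PRON")],
    [("that", "REL"), ("I", "PRON"), ("visited", "VERB")],
]

print(proportions(TOY_CLAUSES))
```

A real analysis would also need to handle reduced relatives, other relativizers, and tagging errors, which this sketch ignores.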
But are pronominal object relatives involving personal pronouns easier to process than their subject-relative counterparts, as would be expected if processing is facilitated by distributional frequency? To test this prediction, Reali and Christiansen (2007a) conducted three self-paced reading experiments involving first-person (I/me), second-person (you), and third-person (they/them) personal pronouns, as well as a fourth experiment involving a third-person impersonal pronoun (it). Examples of the subject and object relative clause stimuli used in the experiment with first-person personal pronouns can be seen in (4) and (5), respectively.
(4) The lady that visited me enjoyed the meal.
(5) The lady that I visited enjoyed the meal.
In the experiment with second-person personal pronouns, the subject and object relative materials are exemplified by (6) and (7).
(6) The lady that visited you enjoyed the pool in the back of the house.
(7) The lady that you visited enjoyed the pool in the back of the house.
Because the personal pronouns they/them are so-called referring pronouns, they need to be grounded in prior context when they are processed online—i.e., there needs to be something for them to refer back to. The examples of the subject and object relative clause stimuli in (8) and (9) therefore include a short preamble (According to the students), which provides the referent for they/them.
(8) According to the students, the teacher that praised them wrote excellent recommendation letters.
(9) According to the students, the teacher that they praised wrote excellent recommendation letters.
In the last experiment, the antecedent referent for the impersonal third-person pronoun it occurred in a brief sentence prior to the sentence containing the subject or object relative clauses, as in (10) and (11), respectively.
(10) The minivan was really fast. The car that chased it lost control suddenly.
(11) The minivan was really fast. The car that it chased lost control suddenly.
All four experiments measured online sentence processing using the so-called moving-window, self-paced reading task (Just, Carpenter, & Woolley, 1982). This experimental paradigm is a “work horse” of psycholinguistic research, in which participants read sentences on a computer screen, one word at a time, pressing a key to reveal the next word. The amount of time spent on each word provides a sensitive index of the difficulty people have in processing various parts of a sentence. Reali and Christiansen (2007a) reasoned that possible differences in processing difficulty associated with the subject and object relative versions of the sentences should show up in the two-word region that contrasted the two relative-clause types: VERB-PRONOUN (subject relatives) vs. PRONOUN-VERB (object relatives) (e.g., chased it vs. it chased in 10 and 11). The bottom of figure 6.1 shows the results from the four experiments, with reading times averaged across the critical two-word region. The top of the figure shows the predictions from distributional patterns of the same pronominal relative-clause types from the corpus analyses. As predicted by these analyses, object relatives were processed faster in sentences with personal pronouns in the relative clauses (I/me, you, they/them). The opposite pattern was seen for the impersonal pronoun (it), consistent with the hypothesis that Chunk-and-Pass processing of specific relative clause constructions is strongly influenced by their frequency of occurrence. A similar impact of distributional patterns is also observed cross-linguistically in children’s acquisition of relative clauses, as we discuss further below.
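The averaging over the critical two-word region can be sketched as follows. This is a minimal illustration, assuming each trial is stored as a list of per-word reading times in milliseconds plus the index where the critical region starts. The field names, function names, and reading-time values are invented placeholders, not data or code from the experiments.

```python
from statistics import mean

def critical_region_rt(word_rts, region_start, region_length=2):
    """Average the per-word reading times (ms) over the critical region."""
    region = word_rts[region_start:region_start + region_length]
    if len(region) != region_length:
        raise ValueError("critical region falls outside the sentence")
    return mean(region)

def condition_means(trials):
    """Average the critical-region reading time within each condition."""
    by_condition = {}
    for trial in trials:
        rt = critical_region_rt(trial["rts"], trial["region_start"])
        by_condition.setdefault(trial["condition"], []).append(rt)
    return {cond: mean(values) for cond, values in by_condition.items()}

# Invented reading times for "The car that chased it ..." (SR) and
# "The car that it chased ..." (OR); the critical region starts at word index 3.
TRIALS = [
    {"condition": "SR", "rts": [310, 320, 330, 400, 380, 350, 340, 360], "region_start": 3},
    {"condition": "OR", "rts": [310, 320, 330, 420, 440, 350, 340, 360], "region_start": 3},
]

print(condition_means(TRIALS))
```

Published analyses typically add steps omitted here, such as trimming outliers and adjusting for word length.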
The result of the experiment involving first-person personal pronouns (I/me) was replicated by Roland et al. (2012) with new items. They conducted additional analyses and experiments suggesting that differential discourse expectations for subject and object relatives provide an important source of constraints on relative clause constructions. Specifically, their results indicate that object relative clauses (whether involving pronouns or full noun phrases) tend to concern a discourse-old referent, whereas subject relative clauses tend to introduce discourse-new information. Extending this work, Heider, Dery, and Roland (2014) conducted corpus analyses of relative clause constructions involving the impersonal pronoun it. They found that if reduced relative clauses are included in the analysis (The car that it chased … → The car it chased…), then object relatives involving it are actually more frequent than the corresponding subject relatives, contrary to the original analyses of Reali and Christiansen (2007a). Additionally, Heider et al. were unable to replicate the results of Reali and Christiansen’s sentence processing experiment with it, instead finding either no differences between subject and object relatives or faster processing of object relatives. They concluded that discourse-based expectations trump fine-grained frequency information at the level of specific relative clause patterns (that VERB it vs. that it VERB). However, although the Chunk-and-Pass processing perspective assigns a key role to top-down constraints from discourse anticipations in language comprehension, these are likely to be intertwined with fine-grained distributional expectations for specific word combinations.4
A study by Reali and Christiansen (2007b) provides initial support for this perspective. They compared the online processing of object relative clauses in which the combination of the first-person pronoun I and the subordinate verb formed either a high-frequency two-word chunk (I loved in 12) or a low-frequency one (I phoned in 13):
(12) The actress who I loved phoned the comedian who presented his new show yesterday.
(13) The actress who I phoned loved the comedian who presented his new show yesterday.
They found that even though both sentence types were equally easy to understand offline, the sentences with the high-frequency I-VERB chunks were processed significantly faster. Additional analyses further demonstrated that not only did the frequency of the I-VERB chunk predict reading times for that critical region, but it was also associated with faster processing of the subsequent main verb. Crucially, these effects of fine-grained distributional information cannot be explained by discourse expectations or working memory constraints, as these are constant across the two sentence types.
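Chunk frequency of this kind is, at its simplest, a count of adjacent word pairs. The sketch below shows one way to tally such pairs. The four-sentence corpus is invented for illustration, the helper names are hypothetical, and whitespace tokenization is a deliberate simplification.

```python
from collections import Counter

def bigram_counts(sentences):
    """Count adjacent word pairs across whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

def relative_frequency(counts, chunk):
    """Share of all counted word pairs accounted for by one chunk."""
    total = sum(counts.values())
    return counts[chunk] / total if total else 0.0

# Invented toy corpus; a real estimate would need a large corpus.
TOY_CORPUS = [
    "I loved the show",
    "I loved that film",
    "I phoned the office",
    "we loved it",
]

counts = bigram_counts(TOY_CORPUS)
for chunk in [("i", "loved"), ("i", "phoned")]:
    print(chunk, counts[chunk], round(relative_frequency(counts, chunk), 3))
```

In this toy corpus I loved occurs more often than I phoned, which is the kind of contrast the study used to define its high- and low-frequency conditions.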
Future work is needed to determine how fine-grained distributional cues interact with discourse-based expectations in relative clause processing. However, the Chunk-and-Pass perspective suggests that both types of constraints play a key role and are likely to be strongly affected by experience with language. We would therefore expect differences across development and between individuals in how distributional information and discourse factors will influence specific patterns of language comprehension and use, especially when combined with the variety of other cues discussed in chapter 5.
6.1.2 Cross-Linguistic and Developmental Patterns of Relative Clause Processing
The importance of fine-grained statistical information for relative clause processing becomes further evident when considering languages other than English. For example, Reali (2014) observed an English-like pattern in the distribution of relative clauses with pronouns and full noun phrases in Spanish, but also discovered a more fine-grained distributional pattern within object relative clauses. Spanish is a so-called pro-drop language, meaning that pronouns in object relative clauses may be dropped because verb inflection contains sufficient information about the pronominal subject of the subordinate clause. Thus, the pronoun nosotros (we) may be dropped in (14) without loss of information in Spanish because the inflection of the verb perseguimos unambiguously indicates that the subject is a first-person plural pronoun.
(14) El sapo que nosotros perseguimos.
[The toad that we chased]
Reali found that Spanish object relatives overwhelmingly tended to drop pronouns (whereas subject relatives always have overt pronouns), and incorporated changes to the order of subject (S) and verb (V), depending on whether the object relative involved a full noun phrase (VS) or an overt pronoun (SV). A subsequent self-paced reading-time study showed that Spanish speakers are sensitive to such fine-grained information. The results cannot be easily explained by discourse factors because these were held constant across the variations of the object relative clauses, using the same words in the alternative word orders. These and other cross-linguistic results (e.g., see contributions in Kidd, 2011) point to language-specific patterns of relative clause processing that are likely to result from the differential weighing of multiple factors, including fine-grained statistical information, working memory constraints and discourse-based expectations (see also Kidd & Bavin, 2002; O’Grady, 2011).
Similarly, cross-linguistic patterns of development provide further insight into the nature of relative clause processing. In a study with three-, four-, and five-year-old English speaking children, Kidd and Bavin (2002) found that the ability to deal with center-embedded relative clause sentences (such as 1–11 above) emerges gradually during development. In the study, children were asked to act out sentences in which relative clauses were either right-branching (as underlined in 15–16) or center-embedded (as underlined in 17–18).
(15) The kangaroo stands on the goat that bumped the horse.
(16) The horse jumps over the pig that the kangaroo bumped.
(17) The cow that jumped over the pig bumps the sheep.
(18) The sheep that the goat bumped pushes the pig.
Kidd and Bavin found that by four years of age, children had a good grasp of right-branching subject (15) and object (16) relative clauses, whereas their mastery of center-embedded subject (17) and especially object (18) relatives lagged behind. From a Chunk-and-Pass perspective, right-branching relative clause constructions are comparatively easy to process because they do not require “open” chunks (i.e., the main clause in 15 and 16 can be chunked into a single unit and passed up for further processing). In contrast, when processing center-embedded relative clauses (in 17 and 18), the chunking of the main clause cannot be completed until the relative clause has itself been chunked, creating a memory load that interferes with the processing of incoming input. Only once chunking becomes faster and more efficient, as a function of repeated experience, do children become better at processing center-embedded relative clauses (see O’Grady, 2011, for a related perspective). Thus, the Chunk-and-Pass perspective provides a potential processing account of children’s gradual acquisition of complex sentence structure through clause expansion, as advocated by Bloom (1991), Diessel (2004), Kidd and Bavin (2002), and Tomasello (2003) (we discuss this issue further in chapter 7).
Interestingly, experience with specific types of relative clause constructions appears to play a key role in the development of children’s processing abilities. Indeed, several studies have shown cross-linguistically that children’s processing improves markedly when they are presented with relative clause constructions that are representative of the input they receive (e.g., Brandt, Kidd, Lieven, & Tomasello, 2009; Arnon, 2010; Kidd, Brandt, Lieven, & Tomasello, 2007). For example, Arnon (2010) reports results from an analysis of Hebrew child-directed speech as well as child productions, finding in both cases that object relative clauses tended to involve pronouns (or have no overt subject), whereas subject relatives primarily contained full lexical noun phrases. When testing three- to five-year-old Israeli children on Hebrew relative clause constructions, she found that they performed better on pronominal relative clauses, especially when it came to object relatives. However, in contrast to the results with adults (Reali & Christiansen, 2007a; Roland et al., 2012), the presence of pronouns did not make subject relatives easier to comprehend and produce than object relatives. As Arnon notes, this might be because the experimental items did not closely reflect the fine-grained distributional information regarding object relatives. In the children’s input, the subject of the main clause, which is being modified by the object relative clause, is inanimate most of the time (as underlined in 19), whereas the experimental items all involved an animate main-clause subject (as underlined in 20).
(19) The ball that I like to play with.
(20) The girl that I am drawing.
Indeed, Kidd et al. (2007) found that children are sensitive to such animacy constraints in English and German and that, when presented with sentences in line with such constraints, they process object relatives as easily as subject relatives. In a related study, Brandt et al. (2009) tested English- and German-speaking three-year-olds’ comprehension of relative clauses using a referential task in which children were asked to pick a toy in response to questions such as (21) and (22):
(21) Can you give me the ball that he just threw?
(22) Can you give me the donkey that he just fed?
As predicted by corpus data, children were much better at understanding pronominal object relatives modifying an inanimate noun (as in 21) rather than an animate noun (as in 22). Moreover, when animacy was taken into account, pronominal object relatives were better understood than pronominal subject relatives.
From the viewpoint of Chunk-and-Pass processing, the picture that emerges from these studies is that repeated exposure to specific constructions over development (and in adulthood) may result in the representation of chunk-related information at different levels of granularity, from individual word combinations (e.g., involving specific pronouns, such as I, as shown by Reali & Christiansen, 2007b) to abstract constructions (subject relatives: that—VERB—NP – object relatives: that—NP—VERB). Top-down information from discourse expectations may further lead to the formation of prototypical schemas, such as INANIMATE NP—PRONOUN—VERB for object relatives (Brandt et al., 2009). Importantly, though, such schemas are themselves probabilistic in nature, reflecting the distribution of the relevant patterns in the language in general (e.g., as suggested cross-linguistically by the results of Kidd et al., 2007, and Reali, 2014). The distribution of these patterns changes across time as children receive an increasing proportion of their input from reading: children’s production of object relatives actually drops compared to passive relative clauses (e.g., The toy that was carried by the girl) with increased reading experience, reflecting the higher proportion of the latter in written language (Montag & MacDonald, 2015). More generally, variations in the properties of the language being learned (e.g., regarding word-order flexibility, availability of cues from gender and case markings, etc.) will result in different processing biases (e.g., O’Grady, 2011) and different rates of acquisition (e.g., Kidd et al., 2007), as children learn to integrate the multiple cues relevant for Chunk-and-Pass processing of relative clauses in their native language.
6.2 Individual Differences in Chunk-and-Pass Processing
Cross-linguistic data on the acquisition and processing of relative clauses have revealed a strong relationship between the patterns of different types of subject and object relatives as they occur in everyday language and the difficulty with which they are processed by children and adults (e.g., Arnon, 2010; Kidd et al., 2007; Reali & Christiansen, 2007a). This highlights the role of linguistic experience in shaping Chunk-and-Pass processing relative to a specific language, in some cases down to the level of particular word combinations (e.g., Arnon & Snider, 2010; Bannard & Matthews, 2008; Reali & Christiansen, 2007a). But given that the breadth and variety of linguistic experience can vary quite substantially (e.g., as a product of social, economic, and educational factors), does this then result in substantial differences in people’s language abilities?
There are, as we have noted, indeed substantial differences in language ability—even within the otherwise fairly homogenous group of undergraduate students often used in psycholinguistic experiments (see Dąbrowska, 2012; Farmer, Misyak, & Christiansen, 2012, for reviews—see also box 6.2). These differences have been attributed to several factors, including variations in working memory capacity (e.g., Just & Carpenter, 1992), cognitive control (e.g., Novick, Trueswell, & Thompson-Schill, 2005), and perceptual processing (e.g., Leech, Aydelott, Symons, Carnevale, & Dick, 2007). Here we focus on the potential role of language experience in explaining individual differences in the processing of relative clauses.
King and Just (1991) provided some of the first systematic evidence of individual differences in relative clause processing. In a self-paced reading task, they presented participants with sentences containing subject and object relative clauses with full noun phrases (as in 1 and 2, repeated here as 23 and 24).
(23) The reporter that attacked the senator admitted the error.
(24) The reporter that the senator attacked admitted the error.
As a measure of individual differences in verbal working memory for language, they further administered a reading span test (Daneman & Carpenter, 1980) in which participants read aloud progressively larger sets of sentences, one at a time, while retaining the sentence-final words for later recall. The number of sentence-final words that could be correctly recalled by participants—their reading span—was then taken as a measure of their working memory capacity for language processing. King and Just found a main effect of working memory capacity at the main verb (admitted), where processing complexity is high: high-span individuals read sentences faster than low-span participants. They additionally obtained a main effect of sentence type: subject relatives were read faster than object relatives. Finally, they observed an interaction between working memory capacity and sentence type: while both high- and low-span participants read the subject relatives at about the same pace, the low-span participants read the object relatives significantly more slowly than the high-span individuals. King and Just interpreted these results as suggesting that only high-span individuals had sufficient working memory capacity to process object relatives without too much difficulty (see also Just & Carpenter, 1992).
MacDonald and Christiansen (2002) provided an alternative interpretation of the King and Just (1991) results, suggesting that the individual differences observed in this study derive from variations in linguistic experience rather than putative differences in working memory capacity. Specifically, they noted that when processing subject relatives (that attacked the senator), it is possible to piggyback on the processing of simple transitive sentences (the reporter attacked the senator) because both involve the canonical English (S)VO word order. In contrast, this is not possible when processing object relatives (that the senator attacked) because the object of the embedded clause (reporter) occurs before the subject, yielding a non-canonical (O)SV word order that does not map onto the structure of the corresponding simple transitive sentence (the senator attacked the reporter). This means that experience with simple transitive sentences will not facilitate the processing of object relatives. Proficiency in processing object relatives therefore requires direct experience with such constructions, whereas the processing of subject relatives can rely to a large degree on structural overlap with the frequently occurring simple transitive sentences. Brandt et al. (2009) put forward a similar argument with regard to the acquisition of relative clauses in English and German, while pointing to the opposite pattern in Chinese: object relatives follow the canonical word order, apparently making them easier to process than subject relatives (Hsiao & Gibson, 2003; O’Grady, 2011; but see Vasishth, Chen, Li, & Guo, 2013, for an alternative experience-related perspective).
6.2.2 The Frequency × Regularity Interaction in Language Processing
This experience-based explanation of the King and Just (1991) results can be seen as exemplifying the broader relevance of linguistic experience in fine-tuning Chunk-and-Pass processing skills across development and into adulthood. The relationship between linguistic exposure and structural overlap inherent in this interpretation is, moreover, an instance of the Frequency × Regularity interaction that we mentioned in chapter 2. This interaction has been observed across many levels of linguistic processing, from auditory (Lively, Pisoni, & Goldinger, 1994) and visual (Seidenberg, 1985) word recognition, to English past tense acquisition (Hare & Elman, 1995) and aspects of sentence processing (Juliano & Tanenhaus, 1994; Pearlmutter & MacDonald, 1995). The Frequency × Regularity interaction suggests that direct exposure is needed for irregular patterns, whereas the overlap between regular patterns allows experience to be accumulated across different instances.
As an example, consider word recognition. Regularly spelled words, which have a fairly straightforward mapping from letters to sound, are easy to recognize, independent of frequency, because they can be read by virtue of the reader’s experience with similar words. Thus, even though pave is relatively rare, its recognition is facilitated by pattern overlap with similarly spelled words, such as gave, save, rave, cave, shave, etc. In contrast, the recognition of irregular words is highly frequency sensitive: both pint and have are irregularly spelled words (cf. lint, mint, tint and gave, save, cave, respectively) but because pint has a low frequency of occurrence, it is much harder to recognize than have, which occurs very frequently. Crucially, Seidenberg (1985) found that amount of reading experience modulates the effects of the Frequency × Regularity interaction: skilled readers more easily recognized irregularly spelled words than poor readers. MacDonald and Christiansen (2002) suggested that differential experience similarly gives rise to the individual differences in relative clause processing observed by King and Just (1991).
Subject relatives have the standard VO word order characteristic of English; they are “regular” in the sense of the Frequency × Regularity interaction. Chunk-and-Pass processing of subject relatives is therefore facilitated by extensive experience with the same VO structure in simple transitive declarative sentences. However, object relatives have an “irregular” OV order, so that Chunk-and-Pass processing of object relatives will be sensitive to the frequency with which those constructions occur. The amount of linguistic experience therefore should affect object relatives more than subject relatives, providing an alternative explanation of the King and Just (1991) results. From this viewpoint, participants described as having “high working memory capacity” perform better on object relatives because (a) they read more, (b) reading increases exposure to both subject and object relatives, and (c) increased exposure is more important for object relatives than subject relatives. This experience-based explanation challenges standard working memory accounts (e.g., Gibson, 1998; Just & Carpenter, 1992) but is consistent with the role of experience in reading proficiency (e.g., Stanovich & Cunningham, 1992, 1993) as well as in skill acquisition more generally (e.g., Ericsson & Kintsch, 1995).
6.2.3 Simulating the Role of Experience in Relative Clause Processing
To demonstrate the differential role of experience for subject and object relative clauses predicted by the Frequency × Regularity interaction, MacDonald and Christiansen (2002) conducted a set of computational simulations involving Simple Recurrent Networks (SRNs; Elman, 1990; as introduced in chapter 5). These networks have a set of recurrent connections that allow them to learn to process sentences, one word at a time. MacDonald and Christiansen created 10 different SRNs, each randomized with a different set of initial weights, and each exposed to a different corpus consisting of 10,000 sentences. The aim was, very roughly, to capture the fact that language learners approach acquisition with different initial conditions and are exposed to different samples of their native language. To test the potential role of the Frequency × Regularity interaction in explaining individual differences in relative clause processing, the corpora were designed so that they contained 95% simple (transitive/intransitive) sentences and 5% sentences with relative clause constructions (equally divided between subject and object relatives; see note 5). Experience with language was manipulated by allowing the networks one, two, or three exposures to the corpus. After training, each network was tested on a separate test set, involving ten novel subject relatives and ten novel object relatives, not seen during training.
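The composition of these training corpora can be made concrete with a short sketch. The Python code below reproduces only the proportions reported above (10 corpora of 10,000 sentences, 95% simple sentences, and 5% relative clauses split equally between subject and object relatives). The function name, the construction labels, and the use of random seeds are illustrative assumptions; no actual sentences or networks from the original simulations are reconstructed.

```python
import random

def make_corpus(n_sentences=10_000, rc_share=0.05, seed=0):
    """Return a shuffled list of construction labels for one corpus.

    Only the proportions come from the text; the labels are hypothetical.
    """
    rng = random.Random(seed)
    n_rc = round(n_sentences * rc_share)   # sentences with a relative clause
    n_subj = n_rc // 2                     # equal split between the two types
    n_obj = n_rc - n_subj
    corpus = (["simple"] * (n_sentences - n_rc)
              + ["subject_relative"] * n_subj
              + ["object_relative"] * n_obj)
    rng.shuffle(corpus)
    return corpus

# Ten simulated learners, each with a different sample of the language.
corpora = [make_corpus(seed=s) for s in range(10)]

# Experience is varied by the number of passes (1, 2, or 3) over the corpus.
object_relatives_seen = {
    passes: passes * corpora[0].count("object_relative") for passes in (1, 2, 3)
}
```

Under these assumptions a network meets 250 object relatives per pass, against 9,500 simple sentences. This is the asymmetry that matters for the Frequency × Regularity interaction, since only subject relatives can draw on the simple sentences.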
MacDonald and Christiansen (2002) trained their SRNs on a prediction task in which they had to predict the next word in the sentence being processed. This means that the networks learned a probability distribution of possible next items given previous context. To assess performance, MacDonald and Christiansen calculated the Grammatical Prediction Error (GPE; Christiansen & Chater, 1999), which measures the SRN’s ability to make grammatically correct predictions for a specific point in a sentence. GPE maps onto human reading times, with low GPE values reflecting a prediction of fast reading times, and high GPE values indicating slow predicted reading times. The simulation results are shown in figure 6.2 (lower panels) along with the original King and Just (1991) results (upper panels). As predicted by the Frequency × Regularity interaction, additional experience benefitted the processing of object relatives more than subject relatives. Direct experience is needed to build up efficient Chunk-and-Pass processing of object relatives, whereas the processing of subject relatives can piggyback on the processing of simple declarative sentences. Notably, the simulations revealed a pattern similar to the King and Just study (see note 6), in which less trained networks resemble “low-capacity” readers and more experienced networks look like “high-capacity” readers. This suggests that individual differences, previously attributed to variations in working memory capacity, may be better explained in terms of different amounts of linguistic experience. Thus, from a Chunk-and-Pass processing perspective, the reading span task is simply another measure of language processing skill (see note 7), rather than a measure of a dedicated working memory capacity (see also Ericsson & Kintsch, 1995; MacDonald & Christiansen, 2002; Martin, 1995).
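To give a feel for how a prediction-based error score can stand in for reading time, the following Python sketch computes a deliberately simplified score: one minus the share of predicted probability that falls on grammatical continuations. This is not the GPE as defined by Christiansen and Chater (1999), whose full definition is not given in this chapter. The function name, the example distributions, and the set of grammatical continuations are all hypothetical.

```python
def simplified_prediction_error(predicted, grammatical):
    """1 - (probability on grammatical next words) / (total probability).

    A reduced stand-in for GPE, used here for illustration only.
    """
    hits = sum(p for w, p in predicted.items() if w in grammatical)
    false_alarms = sum(p for w, p in predicted.items() if w not in grammatical)
    total = hits + false_alarms
    if total == 0:
        raise ValueError("predicted distribution carries no probability mass")
    return 1.0 - hits / total

# Invented next-word predictions after "The reporter that the senator ..."
less_trained = {"attacked": 0.40, "the": 0.35, "admitted": 0.25}
more_trained = {"attacked": 0.85, "the": 0.05, "admitted": 0.10}
grammatical_next = {"attacked"}  # assumption: only the embedded verb fits here

error_low_experience = simplified_prediction_error(less_trained, grammatical_next)
error_high_experience = simplified_prediction_error(more_trained, grammatical_next)
```

With these made-up values the less trained distribution yields the higher error, which under the mapping described above corresponds to a slower predicted reading time at that word.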
6.2.4 Inducing Individual Differences in Relative Clause Processing
Acheson, and MacDonald (2009) directly manipulated people’s exposure to language. Specifically, by analogy to the SRN simulations, they wanted to determine whether individual differences in relative clause processing could be induced simply by providing people with more opportunity to process relative clauses. Wells et al. first used the standard self-paced reading task to assess their participants’ baseline processing of subject and object relative sentences (similar to the sentences used by King & Just, 1991). Over three separate exposure sessions, four to eight days apart, participants were then asked to
(25) The police officers that searched for the missing child discovered several homeless children in an abandoned house.
(26) The former policeman that the store manager hired caught a thief red-handed.
The exposure sentences were presented one by one, with each sentence appearing all at once on the computer screen (in contrast to the word-by-word presentation in the self-paced reading task). After the participants had read a sentence, it would disappear, and they were asked to choose between one of two statements, only one of which was compatible with the meaning of the original sentence. This comprehension probe ensured that the participants read the exposure sentences for their meaning. Participants were then tested again using the self-paced reading task (with novel stimuli) on their ability to process subject and object relative clauses four to eight days after their final exposure session. To ensure that any potential improvements in relative clause processing were due to experience with these specific syntactic constructions, a control group went through the same testing and training regime as the experimental participants but with exposure to complex sentences involving either sentential complements (as in 27) or conjoined sentences (as in 28) instead of relative clause constructions.
(27) The angry prosecutor denied that the police had tainted the evidence from the crime scene.
(28) The police officers searched for a missing child and discovered several homeless children in an abandoned house.
These control-group training sentences were also chosen to cover similar semantic themes, use similar words, and be of similar average length. Finally, the two groups were matched on both reading span and basic reading skill prior to the exposure manipulation.
As expected, Wells et al. (2009) found that the difference in processing difficulty between the subject and object relative clauses became smaller for the participants in the experimental condition as compared to the control group, demonstrating the expected effect of relative clause experience. Figure 6.3 (upper panels) shows the improvement from before and after training for the experimental participants. Consistent with predictions from MacDonald and Christiansen’s (2002) simulations, experience with relative clause processing facilitated object relatives more than subject relatives (Hutton & Kidd, 2011, observed a similar differential experience-related effect on structural priming of subject and object relatives). After training, the experimental group’s performance closely resembled that of the participant group labeled as “high-capacity” readers in the King and Just (1991) study, whereas before training the very same participants looked like “low-capacity” readers. Thus, as predicted by the Frequency × Regularity interaction, variations in adults’ experience with relative clauses can explain processing differences previously attributed to working memory capacity.
So far, we have discussed evidence that highlights the role of language experience in Chunk-and-Pass processing—in particular as related to relative clauses. Even a short amount of exposure within a single experimental session can change processing patterns, as shown in studies of so-called syntactic adaptation (e.g., Farmer, Monaghan, Misyak, & Christiansen, 2011; Fine, Jaeger, Farmer, & Qian, 2013). But what cognitive mechanisms could mediate such effects of linguistic experience? In chapter 2, we pointed to sequence learning as one of the cognitive mechanisms subserving language (see also Calvin, 1994; Christiansen & Ellefson, 2002; Conway & Christiansen, 2001; Greenfield, 1991). There is a close connection between the general problem of sequence learning and Chunk-and-Pass language processing: both require the detection and encoding of elements occurring in temporal sequences (or spatio-temporal sequences in the case of sign language). Perhaps, then, variations in people’s abilities to pick up statistical regularities among sequence elements might explain some of the experience-based differences across individuals in language processing skills.
Misyak, Christiansen, and Tomblin (2010) sought to address this question by investigating the connection between the learning of nonadjacent relationships among elements in a sequence and online processing of long-distance dependencies produced by relative clauses. To quantify individual differences in sequence learning, they developed a novel experimental paradigm: the AGL-SRT task. This task integrates two previous implicit learning paradigms, combining the structured, probabilistic input of artificial grammar learning (AGL; e.g., Reber, 1967) with the online learning of a serial reaction-time task (SRT; Nissen & Bullemer, 1987). Following Gómez (2002), the sequences all had the form, aiXbi, where ai_bi consisted of three nonadjacent pairs, in which ai was always followed by bi (i.e., a1_b1, a2_b2, a3_b3), and the middle element X was drawn randomly from a set of 24 other items. Each element in a sequence was represented by nonsense words (e.g., pel, wadim, tood), which would be presented visually on a computer screen in all caps (e.g., PEL, WADIM, TOOD). Participants heard spoken forms of the nonsense words and used the computer mouse to click on the corresponding written word on the computer screen as quickly as possible. For each of the three elements in a sequence, the participant had to choose between two nonsense words on the screen: a target and a foil. After multiple blocks of exposure to these sequences, participants showed evidence of having picked up on the nonadjacent relationship between ai_bi, slowing down their responses when presented with sequences that violated this pattern (e.g., *a1_b2).
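The structure of the aiXbi sequences can be sketched in a few lines of Python. The sketch uses the abstract labels from the text (a1, b1, and so on) rather than the nonsense words, since the mapping from labels to words is not given here. The function names, and the choice to build a violation by pairing ai with the ending of a different pair, are assumptions made for illustration.

```python
import random

# Three nonadjacent pairs: each a_i is always completed by its own b_i.
PAIRS = {"a1": "b1", "a2": "b2", "a3": "b3"}
# The intervening element X is drawn from a set of 24 other items.
MIDDLES = [f"X{i}" for i in range(1, 25)]

def make_sequence(rng, grammatical=True):
    """Build one three-element sequence, optionally violating the pairing."""
    first = rng.choice(sorted(PAIRS))
    middle = rng.choice(MIDDLES)
    if grammatical:
        last = PAIRS[first]
    else:
        # Assumed violation type: the ending of one of the other two pairs.
        last = rng.choice([b for a, b in PAIRS.items() if a != first])
    return (first, middle, last)

def is_grammatical(sequence):
    """True if the final element matches the first, whatever intervenes."""
    first, _, last = sequence
    return PAIRS.get(first) == last

rng = random.Random(1)
training_block = [make_sequence(rng) for _ in range(100)]
violations = [make_sequence(rng, grammatical=False) for _ in range(100)]
```

Because the middle element varies freely, the only reliable regularity in the training block is the dependency between the first and last elements, which is what a learner must detect across the intervening item.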
Given the importance of prediction for language processing (e.g., see chapter 4), Misyak et al. (2010) gave their participants a prediction task at the end of the AGL-SRT task. Specifically, participants would hear and respond to the first two words of a sequence as before, but were then asked to predict which of two written nonsense words would come next (e.g., a1X17_ where the two response options would be the target, b1, and a foil, b2). Misyak et al. observed considerable individual differences among the participants on this task (from 25 to 100% correct), which they correlated with performance on the standard self-paced reading task incorporating the same relative clause stimuli as in the Wells et al. (2009) study. Individual variation in performance on the AGL-SRT prediction task was negatively correlated with reading times at the main verb in object relatives (e.g., admitted in 24) where the long-distance dependency with the subject noun (e.g., the reporter in 24) needs to be resolved: better sequence-learning ability was associated with shorter reading times. Strikingly, when Misyak et al. divided the participants into high- and low-performing sequence learners based on whether they scored above or below chance (50%) on the prediction task, a familiar pattern of reading times for subject and object relative clauses emerged. As illustrated in figure 6.3, the pattern of reading times for high- and low-performing sequence learners (bottom panel) closely resembles that of the individuals before and after training in the Wells et al. study (top panel). Thus, individual differences in sequence learning contribute to variations in the ability to use linguistic experience to fine-tune Chunk-and-Pass processing of language (see box 6.3 for an application of this perspective to so-called specific language impairment).
Because of the processing pressures from the Now-or-Never bottleneck described in chapter 4, we might expect that a basic ability for chunking sequential input would underlie the close connection between sequence learning and language. That is, individual differences in chunking, as a fundamental memory skill, should predict variations in language processing (see also Jones, 2012; Jones, Gobet, Freudenthal, Watson, & Pine, 2014). To explore this possibility, McCauley and Christiansen (2015) devised a novel twist on a classic psychological memory paradigm—the serial recall task—to pinpoint chunking ability. They used a corpus analysis to extract sublexical units of two or three consonants, differing in their frequency of occurrence as a chunk. These consonant bigrams (pairs) and trigrams (triples) were then concatenated according to frequency into strings consisting of eight or nine letters, respectively, to be used in the recall task. An example of a high-frequency trigram string is x p l n c r n g l, whereas v s k f n r s d is a low-frequency bigram string. In order to factor out potential effects of basic short-term memory, attention, and motivation, control items were created by pseudo-randomizing the experimental items to minimize bigram/trigram information (e.g., l g l c n p x n r is the matching control string to the above high-frequency item). When participants were asked to recall these stimuli, McCauley and Christiansen found a strong effect of chunk frequency. Because there were no vowels in the strings, they could not be easily pronounced and thereby rehearsed. Successful recall therefore required generalizing past experience with the relevant sublexical consonant combinations during reading to the new non-linguistic context of the memory task.
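The logic of the matched control items can be illustrated with a small Python sketch. It rearranges the letters of an experimental string until none of the original chunks survives intact. This is a weaker criterion than the one described above, which minimized bigram/trigram information in general, and it is not the authors' actual procedure. The function names and the retry limit are assumptions.

```python
import random

def contains_chunk(string, chunks):
    """True if any of the original chunks appears intact in the string."""
    return any(chunk in string for chunk in chunks)

def control_string(chunks, rng, max_tries=1000):
    """Shuffle the letters of the concatenated chunks until no chunk survives."""
    letters = list("".join(chunks))
    for _ in range(max_tries):
        rng.shuffle(letters)
        candidate = "".join(letters)
        if not contains_chunk(candidate, chunks):
            return candidate
    raise RuntimeError("no chunk-free arrangement found within max_tries")

# The high-frequency trigram item quoted in the text: x p l n c r n g l
trigrams = ["xpl", "ncr", "ngl"]
experimental = "".join(trigrams)
control = control_string(trigrams, random.Random(0))
```

The control keeps exactly the same letters as the experimental item, so any recall advantage for the experimental string can be attributed to its chunk structure rather than to its letters or its length.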
McCauley and Christiansen (2015) further observed considerable individual differences in participants’ performance on the chunking task. To assess variation across individuals in language processing, the participants were also administered a standard self-paced reading task incorporating the same relative clause materials used by Wells et al. (2009). To derive an individual difference measure of chunking ability that controls for basic short-term memory span, McCauley and Christiansen calculated the mean difference in recall rate between experimental and control items (i.e., measuring how much performance was facilitated by the presence of chunks in the experimental items). As expected, individual differences in chunking ability predicted variation in online processing of relative clauses—better chunking was associated with faster processing—with a stronger effect for object relatives than subject relatives. When participants were divided into “good chunkers” and “poor chunkers” (using a median split), the familiar differential pattern (from figures 6.2 and 6.3) of subject and object relative processing emerged. The pattern of processing for the good chunkers resembled that of the good sequence learners and more experienced readers, with relatively little difference between subject and object relatives. In contrast, the poor chunkers looked like poor sequence learners and less experienced readers, experiencing particular processing difficulty with object relatives compared to subject relatives. These results provide initial, tantalizing support for the hypothesis that basic chunking abilities might arise from a single learning and memory mechanism that deals with both linguistic and non-linguistic sequential input, through their integration into Chunk-and-Pass language processing.
More research is needed, though, to fully substantiate this hypothesis, as well as to examine individual differences in chunking ability across developmental time, tracing the impact of chunking on specific aspects of language acquisition, including the early development of complex sentence processing.
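The individual-difference measure described above (recall on chunk-based items minus recall on matched controls, followed by a median split) can be sketched as follows. The recall values are invented for four hypothetical participants and do not come from the study. Treating scores exactly at the median as "poor" is likewise an assumption, since the text does not say how ties were handled.

```python
from statistics import mean, median

def chunk_sensitivity(experimental, control):
    """Mean recall on experimental items minus mean recall on control items."""
    return mean(experimental) - mean(control)

def median_split(scores):
    """Label each participant 'good' or 'poor' relative to the group median."""
    cut = median(scores.values())
    return {pid: ("good" if s > cut else "poor") for pid, s in scores.items()}

# Invented recall proportions: (experimental items, control items)
recall = {
    "p1": ([0.9, 0.8], [0.5, 0.6]),
    "p2": ([0.6, 0.6], [0.5, 0.5]),
    "p3": ([0.7, 0.9], [0.6, 0.6]),
    "p4": ([0.5, 0.5], [0.5, 0.5]),
}

scores = {pid: chunk_sensitivity(exp, ctl) for pid, (exp, ctl) in recall.items()}
groups = median_split(scores)
```

Subtracting control recall is what separates chunking from raw memory span: in this toy data p4 recalls both item types equally well and so receives a chunk sensitivity of zero, however good the overall recall might be.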
In this chapter, we have explored the role of experience with language in developing and fine-tuning Chunk-and-Pass processing skills. Focusing on relative clause processing, we discussed how our ability to process specific types of subject and object relatives is strongly affected by the frequency with which they occur in natural language. Of course, other factors, such as discourse expectations, cognitive control, and memory limitations also play a role, though in the current framework these factors themselves are likely to be adapted to linguistic experience (as discussed in chapter 4). The key role of exposure to specific patterns of language was further underscored by evidence from neural network simulations and human experimentation, which suggested that individual differences previously attributed to variations in working memory capacity may emerge from variations in experience and processing architecture. In line with our evolutionary arguments in chapter 2, we considered evidence suggesting that at least part of our ability to carry out Chunk-and-Pass processing of relative clauses appears to rely on cognitive mechanisms for sequence learning, which mediate the role of linguistic experience, likely through basic memory skills for chunking.
When evaluating the impact of linguistic experience on language processing, it is important to consider potential interactions between different types of grammatical regularities, as exemplified by the role of the Frequency × Regularity interaction in creating individual differences in relative clause processing. This means that Chunk-and-Pass processing skills are not only affected by the distribution of individual construction types but also by complex interactions between multiple, partially overlapping, syntactic patterns (of which the overlap between simple transitive sentences and subject relatives is one example). Indeed, Fitz, Chang, and Christiansen (2011) manipulated the relative frequency of different types of relative clause constructions in a neural network simulation. The results showed that the difficulty in processing a particular type of relative clause depended not only on its structural complexity, but also on its frequency in comparison with other types of relative clauses. That is, for three constructions of the type, A, B, and C, the ease with which A is processed may depend on the frequency and the nature of the overlap with B and C. Thus, the language to which we are exposed forms a complex, integrated system of constructions (as noted in chapter 2; see also Beckner et al. 2009). As the language changes across time, so, too, do our Chunk-and-Pass processing skills—though in most cases such changes will be almost imperceptible to us, like a second skin.
(3.) Periander—one of the Seven Sages of Greece—is attributed with the saying “practice does everything” (Laertius, 1853, p. 45), which is often reformulated, or perhaps plainly misquoted, as “practice makes perfect.”
(4.) Indeed, the inconsistencies between the studies of “it” relative clause processing by Reali and Christiansen (2007a) and Heider et al. (2014) might be due to differences between the two studies in the fine-grained statistics of their stimuli. A preliminary analysis of these items using word chunk frequencies obtained from the Corpus of Contemporary American English (COCA; Davies, 2008) reveals subtle differences in the fine-grained statistics of the two stimulus sets, when considering the frequency of the three-word chunks that VERB it vs. that it VERB (where VERB refers to a specific verb such as chased). Moreover, the lack of difference between subject and object relatives that Heider et al. observed when replicating Reali and Christiansen’s “it” experiment with the original stimuli may be attributed to experience-based differences between participant populations in their sensitivity to fine-grained statistics (students at an Ivy League university vs. a state university—a possibility that Heider et al. highlight themselves).
(5.) As discussed above, naturally occurring English tends to have a ratio of 2/3 subject relatives to 1/3 object relatives involving full noun phrases (as in the King & Just, 1991, experiment). However, in order not to bias their results, MacDonald and Christiansen (2002) adopted an equal distribution of subject and object relatives in their simulations.
(6.) There may appear to be a discrepancy between MacDonald and Christiansen’s (2002) simulations of the subject relative constructions and the results from King and Just (1991): the SRNs have less difficulty with processing the final word (senator) in the subordinate clauses compared to human readers. This is probably due to variations in the length of the subject relative clauses in the King and Just materials. When the length is uniform (as in the simulations), readers tend to experience less slow-down, as can be seen in figure 6.3 below.
(7.) Note, however, that because an individual’s language processing skill reflects a number of interacting factors—including language exposure as well as a host of domain-general components—there will not be a perfect one-to-one mapping between reading span scores and linguistic experience (see Farmer, Fine, Misyak, & Christiansen, in press, for discussion).