September 8, 2020
Language, fundamentally and inherently, constitutes social behaviour between speaker and listener. Cognitive linguistics comprises ambiguity-reducing mechanisms required for successful expression and comprehension of intentions during conversation (Keysar, Barr, and Horton, 1998). Optimal communication has been proposed to operate upon Grice’s (1975) principle of cooperation: an implicit assumption that a conversational partnership is cooperative. Naturally, communicative language comprises implications beyond the utterance itself (Sperber and Wilson, 2002). According to this thesis, efficient communication is guided by consideration of, and inferences about, our partner’s communicative state of mind, also known as perspective taking (Krauss and Fussell, 1991). This harmonious view of communication has been challenged by empirical research demonstrating that humans often fail to take each other’s perspective into account (Keysar, 2007). By first outlining key theoretical frameworks of communication, this essay aims to critically analyse research related to language and cooperation in a healthy population. It will be argued that communication is intrinsically egocentric and requires effortful modification into collaborative processes. Moreover, it will be proposed that successful linguistic adjustment is subject to individual differences in cognition.
Central to pragmatic studies is Grice’s (1975) presumption that conversation entails standardized linguistic conduct allowing for the extraction of implications from utterances. Adherence to the cooperative principle follows Grice’s ahistorical set of four maxims, “rules” which, if followed by the speaker, avoid ambiguity. Conceptual design as such has been proposed solely to differentiate a target referent from other contextually available objects (Brown, 1958b; Brennan and Clark, 1996; Cruse, 1977). Grice’s quality maxim holds that speakers should avoid making erroneous references, and the manner maxim holds that they should form utterances in a clear, brief, and orderly fashion (Grice, 1975). Moreover, the maxim of relation postulates that only relevant information should be provided. Finally, the quantity maxim holds that speakers should not provide more than the necessary amount of information. Violations of such informativeness have been demonstrated by studies showing that people often use perceptually salient features and more readily available lexical items even when such descriptions are over-informative (Rosch, Mervis, Gray, Johnson, and Boyes-Braem, 1976; Mangold and Pobel, 1988). Moreover, Grice’s model is problematic in that it isolates language within a single temporal position, neglecting previous interactions and contextual variability, factors implicated in everyday communication.
Like Grice, Sperber and Wilson (1995, 2002) hold that our partner’s mental state is pivotal to communication. However, they argue that Grice’s relevance maxim alone is sufficient for successful inferential communication. According to this theory, cognition is tailored to maximize relevance during conversation by allocating attentional resources to pertinent information. Nevertheless, this theory does not elaborate upon the processes by which speaker and listener decide which information is relevant. This gap is addressed by Clark and Marshall’s (1981) common-ground theory, according to which communication depends primarily upon the assessment of mutually shared knowledge, expectations, and presumptions using several co-presence heuristics. These mental shortcuts guide the speaker’s adjustment of terminology to suit the listener’s comprehension. Initially, the speaker will attempt to establish common ground with their partner; acknowledging this, the listener then signals comprehension (Clark and Wilkes-Gibbs, 1986). Guided by the partner’s feedback, the speaker will design their utterances accordingly, a phenomenon known as “audience design” (Clark and Murphy, 1982). Clark and Marshall (1981) classify these co-presence heuristics into three types: community membership, physical co-presence, and linguistic co-presence.
Evidence for this theory comes from expert-novice accommodation in a study by Isaacs and Clark (1987). In a postcard matching task, directors were instructed to describe pre-arranged postcards on a grid so that the matcher could arrange their own postcards accordingly. Across different pairings of New Yorkers (experts) and non-New Yorkers (novices), results showed that experts generally used more proper names of buildings. This contrasted with the definite descriptions primarily used by novices. Irrespective of the pair’s level of expertise, efficiency tended to increase across trials, as reflected by a decrease in the number of words and turns required to establish a complementary reference. Moreover, there was a 20% decrease in proper name use by expert directors when their matching partner was a novice. Similarly, novice directors increased their use of proper names across trials when their partner was an expert. This cooperative modification of utterances implies that directors adapted their descriptive language in parallel with increasing sensitivity to the common ground shared with the matcher, resulting in higher communicative efficiency. However, real-life conversational encounters involve greater communicative divergence, which challenges the generalizability of these findings.
Further evidence for cooperative communication comes from studies examining shared linguistic context (Clark and Marshall, 1981). Wilkes-Gibbs and Clark (1992) assessed the transfer of a speaker’s collaborative processes across two different partners. Participants were allocated the role of director, matcher A, or matcher B. The director and matcher A were provided with depictions of 12 tangram figures, and the director’s task was to guide the matcher, through description, to arrange their cards identically to the director’s own. The nature of matcher B’s co-presence during the interaction between the director and matcher A was manipulated. Results showed that when communicating with matcher A, the director’s use of indefinite and descriptive references progressively shifted to definite and nominal references. This complements previous findings on the optimization of reference efficiency across trials, guided by the establishment of common-ground knowledge (Isaacs and Clark, 1987). More importantly, the director’s use of definite and nominal references was highest during subsequent interactions with a matcher B who had formerly occupied the position of a silent side participant. In such cases, matcher B observed the communication between the director and matcher A in close proximity and, to the knowledge of the director, could see the order of their tangram figures. This gave matcher B the privilege of sharing communicative references with the director, who subsequently adjusted their utterances accordingly. This pre-established common-ground advantage was less pronounced for matcher Bs who had been further from the director during the interaction with matcher A. This demonstrates communicative use of pre-established common-ground knowledge, implying that utterances are adjusted in line with implicit beliefs about shared information.
Indeed, several modulating factors of perspective taking have been proposed (Brennan and Clark, 1996). Historical models of communication view lexical choices as subject to previous communicative interactions (Carroll, 1980; Clark and Wilkes-Gibbs, 1986). This process comprises the negotiation and adaptation of a mutually comprehensible reference, also known as lexical entrainment (Garrod and Anderson, 1987). Brennan and Clark (1996) demonstrated that lexical entrainment operates upon four interdependent principles. Firstly, directors used a previously successful reference even when it was over-informative. Secondly, speakers tended to use precedents more frequently when these were grounded. Thirdly, speakers adapted their references in response to the addressee’s feedback. Finally, speakers formed partner-specific conceptual pacts, which were used even when over-informative and abandoned with new partners. This indicates that discourse negotiation does not operate upon Grice’s quantity maxim; rather, it is based upon previous encounters and partner specificity.
Nevertheless, the aforementioned line of research primarily focuses on the final stage of language production and fails to account for the stage at which common ground is considered. Using a referential communication task, Horton and Keysar (1996) found that speakers used common-ground information more often than privileged-ground information. However, when subject to temporal constraints, speakers were just as likely to use common-ground as privileged-ground information in their descriptions. These findings suggest that speakers do not predominantly rely on a common-ground utterance plan. The authors proposed that the language production system initially activates egocentric references. In line with the Monitoring and Adjustment model, speakers then scan for common-ground violations and alter these into a more appropriate, cooperative utterance. Therefore, it may be plausible to suggest that language production is primarily facilitated on an egocentric basis and requires effortful cognitive modification to suit a collaborative setting.
Evidence from language comprehension further supports the notion of egocentricity. Keysar, Barr, Balin, and Brauner (2000) measured listeners’ eye movements during a referential communication task. They found that listeners often fixated on objects that were occluded from the speaker’s view, despite being aware of this inaccessibility. In a similar study, Barr and Keysar (2001) found that listeners were faster at identifying objects which had previously been referred to by the speaker than non-precedent items. This pattern of results held even where the precedent was announced by a different director from the original speaker. This implies that language comprehension does not necessitate common-ground establishment and is independent of the conversational partner. Moreover, the study found that listeners’ fixations were guided by the expectation of precedent use even when such referents were over-informative, and irrespective of their partner’s identity. This further contradicts the idea that Grice’s (1975) quantity maxim guides communication, and suggests that listeners’ interpretations operate upon automatic and speedy principles of availability. These findings support a perspective adjustment view which regards language processing as primarily driven by “egocentric anchoring” in conjunction with a steadier, voluntary adjustment process (Horton and Keysar, 1996). According to this thesis, linguistic communication does not require the establishment of common-ground knowledge and is instead driven by the minimization of cognitive load.
Monitoring linguistic ambiguity is a complex and cognitively demanding process. In line with this, Keysar (2007) challenges the cooperative account of successful communication, arguing that ambiguity minimization occurs in a relatively egocentric fashion. According to Keysar, this egocentric architecture of language use is driven by the dominance of our personal knowledge, expectations, and presumptions, followed by effortful reflection on others’ perspectives. This fundamental egocentricity has been explained in terms of cognitive load (Nilsen and Graham, 2009). Using varying pictorial displays, Ferreira, Slevc, and Rogers (2005) asked participants to describe target objects: the homophone “bat” (mammal/sports instrument) and size-contrasting versions of the mammal “bat”, allowing for the examination of linguistic and non-linguistic ambiguity avoidance, respectively. Results showed that in linguistically ambiguous trials, requiring differentiation between “bat” and “baseball bat”, speakers often referred to the target object simply as “bat” (40%). This contrasted with non-linguistic ambiguity trials, where speakers failed to use size-contrasting modifiers for conceptually identical items only 1% of the time. This was interpreted as reflecting earlier detection of non-linguistic ambiguity at the conceptual level of language production (Levelt, Roelofs, and Meyer, 1999). “Big bat” and “small bat” map onto a single lemma representation, whereas “bat” and “baseball bat” map onto separate lemma representations but a single phonetic representation, which makes the ambiguity harder to detect and avoid. Thus, under linguistic constraints speakers may violate cooperative terms by failing to account for conceptual differences in utterance production.
The inherently egocentric nature of communication is further supported by studies of language in children. Piaget (1959) argued that children’s capacity to accommodate their communication to another person’s perspective does not fully develop until 7 or 8 years of age. In examining language production, Deutsch and Pechmann (1982) found that 50% of 6-year-old children and 87% of 3-year-olds described objects inadequately, resulting in interpretative ambiguity for listeners. This contrasted with just 22% of 9-year-olds and 6% of adults. Similar results have been obtained in studies of language comprehension. Using a referential communication task, Epley, Morewedge, and Keysar (2004) tracked the eye movements of children and adults whilst they followed directions from a speaker. Results showed that both age groups were equally likely to initially produce an egocentric interpretation of instructions by gazing at privileged-ground, occluded objects. However, adults were faster and more efficient at adjusting their egocentric interpretation into a more appropriate, cooperative one. This implies that whilst both age groups are primarily egocentric during language comprehension, adults may be better at adjusting to cooperative terms. Moreover, Nadig and Sedivy (2002) found that children as young as 5 are able to take their partner’s perspective during both language production and comprehension when engaging in a simplified version of a referential communication task. This further emphasizes the role of cognitive load in modulating perspective taking performance.
Indeed, optimal performance on the referential communication task requires the operation of crucial aspects of cognition known as Executive Functions (EF). EF refers to an attentional regulatory system comprising performance-optimizing processes required for efficient goal-directed behaviour, including inhibitory control, working memory (WM), and cognitive flexibility (Miyake et al., 2000). Critical development of EF occurs between the ages of 3 and 5, and EF continues to develop well into adolescence (Carlson, 2005; Diamond, 2013). It may be plausible to suggest that the immature nature of EF in children is responsible for perspective taking inefficiency. For example, during a referential communication task, inhibitory control may be required to inhibit the predominantly egocentric, perspective-inappropriate utterance or interpretation. Moreover, WM capacity may help a person to hold features of a target referent in mind and use online computation to compare these to relevant surrounding objects. Finally, cognitive flexibility may aid language processing by enabling the individual to think about a target referent simultaneously from their own perspective and from their partner’s perspective, and to shift between these divergent perspectives. Nevertheless, Nilsen and Graham (2009) found that only inhibitory control predicted preschool children’s perspective taking performance during a language comprehension task. The authors interpreted this as reflecting the unexpectedly minimal task requirements in terms of WM and cognitive shifting. Another plausible explanation is the earlier development of inhibition and the slower advancement of WM and cognitive flexibility during early childhood (Carlson, 2005; Zelazo, Müller, Frye, and Marcovitch, 2003).
Research in adults has demonstrated that tasks which increase cognitive load through higher WM, inhibitory control, and/or abstract thinking requirements result in lower perspective taking performance (Roßnagel, 2000; Kronmüller and Barr, 2007; Lin, Keysar, and Epley, 2010). Moreover, perspective taking sensitivity and egocentric suppression have been found to vary as a function of EF competency. Studies on language comprehension have demonstrated that people with higher WM and inhibition scores are better able to overcome egocentric interpretations and consider the speaker’s perspective (Brown-Schmidt, 2009; Lin et al., 2010). These findings echo results from language production studies which show individual differences in perspective taking sensitivity. For example, Wardlow (2013) found that speakers with higher WM and inhibition performance were less likely to use size-contrasting modifiers on privileged-ground trials when directing their partner’s attention to an object. However, in a more recent study with more than double the sample size, Ryskin, Benjamin, Tullis, and Brown-Schmidt (2015) found that only WM capacity predicted perspective taking performance in a language production task. This suggests that perspective taking is regulated by domain-specific processes that vary as a function of conversational modality.
In conclusion, successful linguistic communication requires cooperative negotiation, central to which is perspective taking. Grice’s (1975) principles of cooperation are appealing as a practical account of communication; however, research has demonstrated that people often fail to use these ambiguity-reducing mechanisms. Rather, conversational interactions fundamentally arise from egocentric perspectives, the inappropriateness of which requires effortful monitoring and subsequent adjustment. As such, communication does not systematically, nor exclusively, involve cooperative processes. Instead, speakers and listeners use the most readily available information in guiding their conveyance and understanding of intentions. Thus, perspective taking varies as a function of personal beliefs, contextual modifications, previous interactions, and partner-specific conceptual pacts. Cognitive effort is required for the suppression of linguistic egocentricity and the facilitation of cooperation. The extent of this modification is subject to individual differences in the maturation and efficiency of EF in minimizing cognitive load.


