Gender bias and stereotypes in linguistic example sentences

This article examines the distribution of gender in arguments in example sentences in contemporary linguistics publications. Prior studies have shown that example sentences in syntax textbooks systematically underrepresent women and perpetuate gender stereotypes (Macaulay & Brice 1994, 1997, Pabst et al. 2018). Here we examine example sentences in articles published over the past twenty years in Language, Linguistic Inquiry, and Natural Language & Linguistic Theory and find striking similarities to this prior work. Among our findings, we show a stark imbalance of male (N = 12,117) to female (N = 5,571) arguments, where male-gendered arguments are more likely to be subjects, and female-gendered arguments nonsubjects. We show that female-gendered arguments are more likely to be referred to using a kinship term, to exhibit positive emotions, and to be the object of affection, whereas male-gendered arguments are more likely to have occupations, to exhibit negative emotions, and to perpetrate violence. We show that this pattern has remained stable over the twenty years that we examine, leading up to the present day. We conclude with a brief discussion of possible remedies and suggestions for improvement.

Keywords

gender representation, implicit bias, syntax, publications, academia

1. Introduction

Constructed example sentences represent a main source of data for work in linguistics. As scientists, we are trained to consider our data to be an unbiased source of evidence for our claims about the language or construction under investigation. However, it has long been noted that such examples may reflect the biases of the researchers who construct them and of the world around us—examples tend to display heteronormative gender roles, portray mainly Western protagonists such as John and Mary, and likewise reflect the dominant white culture far more than minoritized races and ethnicities (Macaulay & Brice 1994, 1997, Bergvall 1996a,b).

In a landmark study, Macaulay and Brice (1997; henceforth M&B) analyzed example sentences in eleven then-current syntax textbooks and concluded that 'the majority of constructed example sentences in syntax textbooks are biased toward male-gendered NPs, and … contain highly stereotyped representations of both genders' (p. 822). Twenty years later, Pabst et al. (2018; published as Cépeda et al. 2021) conducted a follow-up study of recently published textbooks in syntax to test whether the problem of gender bias has been rectified in the years since M&B published their work. These authors find that the majority of problems identified by M&B in 1997 still plague syntax textbooks today: male-gendered arguments are overrepresented in linguistic examples, and almost all of the stereotypes observed by M&B are still present in more recently published textbooks.

In this article, we go beyond textbooks to test the distribution and role of arguments in constructed example sentences with regard to gender in journal articles published in three major theoretical linguistics journals over the past twenty years—covering the period between the publication of Macaulay & Brice 1997 and the Pabst et al. (2018) presentation. In short, we observe the same issues of imbalance in example sentences used in the research literature as in textbooks. Although we observe a positive trend, at its current pace the field will not reach parity in this regard for several more decades. These findings affect all three journals we study to the same degree, suggesting that this is an entrenched way of thought in our field, which we argue must change.

Investigating the role of gender representation in example sentences fits into a broader current trend in linguistic research, one of turning the gaze inward and examining bias against underrepresented minorities in the field. In the past several years, within the LSA alone, we have seen the adoption of the 'Guidelines for inclusive language' (Linguistic Society of America 2016); 1 several presentations (e.g. Pabst et al. 2018, Kibbey 2019, Kurumada & Gardner 2019, Muller et al. 2019, Zimman 2019, Haugen & Margaris 2020); plenary events, including the panel 'Our linguistics community: Addressing bias, power dynamics, harassment' (Eckert et al. 2018), the special film showing Talking Black in America: The story of African American English (Language & Life Project 2018), and the talk 'Fostering a culture of racial inclusion in linguistics: For the children of the 9th Ward circa 2005' (Charity Hudley 2020); special sessions such as 'Sharing our views: Native Americans speak about language and linguistics' (Leonard et al. 2018), 'A survey of linguists and language researchers: Harassment, bias, and what we can do about it' (Namboodiripad et al. 2019), 'Linguistic discrimination on the university campus' (Clements et al. 2019), 'Black becoming for language and linguistics researchers' (Lanehart et al. 2020), 'Queer and trans digital modalities' (Kibbey et al. 2020), and 'Hate speech' (Burkholder et al. 2020, Carr et al. 2020); and finally, the recent LSA Statement on Race (Linguistic Society of America 2019) and accompanying paper (Charity Hudley et al. 2018).

In short: who we hire, who we cite, and who we signal is a part of our field to our students and early career researchers has a large impact on its makeup. The shape of the world our example sentences convey to readers—students and active researchers alike—implicitly and sometimes explicitly sends powerful signals about who is welcome in our field and who is less so. This, in turn, affects the kinds of research questions that are welcome, and the kinds of answers that we expect and ultimately adopt. Limiting access to the field inevitably leads to a reduced richness of ideas, research topics, approaches, and types of data collected, and more generally it limits the reach and breadth of our field. It is thus in everyone's interest to increase our field's inclusivity.

In what follows, we present the main results of M&B 1997 and Pabst et al. 2018. We then present our current study and its results, showing pervasive and expansive gender bias in the form of overrepresentation of male individuals and stereotypes of all genders. We conclude with a brief discussion of naturalistic and corpus data in the context of elicitation and discourse analysis, as well as implications for linguists and educators more broadly, and with suggestions for practical ways to improve the shape of the data we use.

2. Terminology and background

Before discussing the prior literature that motivates this study and the study itself, we begin by describing the terminology we use in this article and the choice to use a binary gender designation.

2.1. Terminology

Like the authors of the papers cited above, we rely on the notion of perceived/conceptual gender, using the following definition from Ackerman (2019).

(1) Conceptual gender: The gender that is expressed, inferred, and used by a perceiver to classify a referent (typically human, but can be extended to anthropomorphized nonhumans).

Numerous studies have shown strong gender biases of certain noun phrases, such as surgeon, CEO, nurse, cheerleader (e.g. Garnham et al. 2002, Kennison & Trofe 2003, Duffy & Keir 2004, Gygax et al. 2008, Kreiner et al. 2008, Pyykkönen et al. 2010). Experimental participants tend to expect the referent of nouns such as surgeon and CEO to be male rather than any other gender, while nouns such as nurse and cheerleader are biased female. These trends are robust across many studies, but such inferences are of course defeasible, indicating that these biases are tied to conceptual gender rather than to grammatical gender, defined as follows.

(2) Grammatical gender: Formal syntactic and/or semantic features that are morphosyntactically defined. 2

That is, grammatical gender is a formal syntactic property of a noun, is obligatory, and cannot be overridden. By contrast, conceptual gender is tied to the inferences speakers make and may vary by societal norms, by context, and over time. For example, studies have shown that names, including those that are often considered to be 'gender-neutral', change over time and may become associated stereotypically with either male or female gender (see Barry & Harper 1982, 1993, Van Fleet & Atwater 1997, Lieberson et al. 2000, Hahn & Bentley 2003, among others).

Since the articles we investigate assume (usually implicitly) that gender is a binary, we are forced into the same classification as well. As we discuss in more detail below, our classification of nouns into gender categories in example sentences relies extensively on the perception of individuals reading the examples. Consequently, any reference to 'male' arguments or 'men' is meant to identify those arguments whose conceptual or perceived gender is male, and likewise for 'female' and 'women'. It is important to stress, however, that treating gender as if it is a binary classification is harmful to individuals who identify as neither male nor female; while there is diversity across individuals in the preferred term, such gender identities are often referred to collectively as nonbinary. We hope that one consequence of this work will be to move away from the binary classification, as well as from gender-normative roles, when there is no need for them in linguistic examples.

2.2. Macaulay & Brice 1997

The study we present in §3 is inspired by two prior studies that investigate the distribution of gender in example sentences in syntax textbooks. In what follows, we present a brief summary of these studies. We begin by describing that of Macaulay & Brice 1997. M&B present two studies: the first is a comprehensive study of gender representation in the example sentences in a single textbook (also published in Macaulay & Brice 1994). The second is a comparative study designed to investigate whether the gender imbalance found in the first study generalizes across other textbooks. Here we concentrate on the latter study, whose design and results inspired both Pabst et al. 2018 (see also Cépeda et al. 2021) and the study we present in §3.

M&B (1997) present an investigation of ten syntax textbooks published between 1969 and 1994. Seven textbooks had male authors, and three had female authors. Two hundred examples were randomly sampled from each textbook and manually coded for the following parameters.

(3) M&B's coding

a. Grammatical function (subject, direct object, indirect object, etc.)

b. Thematic relations (agent, patient, experiencer, recipient, etc.)

c. Lexical choices (pronouns, proper names, violence, appearance, etc.)

Here we summarize some of the main findings. We refer the reader to M&B 1997 for a more comprehensive discussion and for examples illustrating each one of the findings. In short, M&B find that example sentences introduce male protagonists at higher rates than female ones, and that they perpetuate gender biases, as summarized in 4. When female-gendered arguments are overrepresented in example sentences, this, too, is done in a way that perpetuates stereotypes, as outlined in 5.

(4) Male-gendered arguments in M&B's findings

a. appear more often as arguments

b. are more likely to be subjects and agents

c. are more often referred to using pronouns and proper names

d. engage more often in 'intellectual activities' such as book reading/handling, and feature more frequently in examples involving cars

e. have (i) more and (ii) more varied occupations

f. are more often engaged in violent activities, especially as perpetrators

(5) Female-gendered arguments in M&B's findings

a. are more often referred to with kinship terms (X's wife, mother)

b. are more likely to have their appearance described

These findings are shown in the selected examples below, representing only a small portion of those shown by M&B to illustrate their observations.

(6) Selected examples from M&B

a. Harry watches the fights and his wife the soap operas.

b. Bill is proud of his father and tired of his mother.

c. Every painting of Maja and photograph of Debbie pleased Ben.

d. The man is hitting the woman with a stick.

e. The man who shot her believed there was someone else who was seeing Helen.

In addition, the syntax textbooks studied by M&B commonly used examples that contained explicit and suggestive language.

(7) Explicit and suggestive language in example sentences from M&B

a. Max doesn't beat his wife because he loves her.

b. After Rambo as a lover, she was exhausted.

c. She's fond of John naked.

d. What a nice pear Mary's got! 3

e. John forced Mary to be kissed by Bill.

Finally, the gender of the textbook author played an important role: male authors were on average much more likely to choose biased examples, whereas female authors tended toward a more balanced sample.

M&B thus conclude that '[o]ur results clearly illustrate the need for such scrutiny: females are simply not significant actors in the world constructed in most corpora of example sentences' (1997:816).

2.3. Pabst et al. 2018

The majority of M&B's findings are replicated in the study of textbooks published between 2005 and 2017 presented by Pabst et al. (2018) and published as Cépeda et al. 2021. Results for male-gendered arguments are summarized in 8. The only one of M&B's findings (4) that was not apparent in those of Pabst et al. (8) is that men often worked with cars.

(8) Male-gendered arguments in Pabst et al.'s findings

a. appear more often as arguments

b. are more likely to be subjects and agents

c. are more often referred to using pronouns and proper names

d. engage more often in 'intellectual activities' such as book reading/handling

e. have (i) more and (ii) more varied occupations

f. are more often engaged in violent activities, especially as perpetrators

Some findings about female-gendered arguments are replicated as well, summarized in 9. Unlike M&B (1997), however, Pabst et al. (2018) no longer find that female-gendered arguments have their appearance described more often than male ones—in fact, they find a very small number of examples that describe physical appearances in general. By contrast, they find that female-gendered arguments exhibit a greater proportion of negative emotions in their sample, a finding that was not investigated in the original M&B study.

(9) Female-gendered arguments in Pabst et al.'s findings

a. are more often referred to with kinship terms (X's wife, mother)

b. exhibit more negative emotions than male-gendered arguments

In general, Pabst et al. find very little sexually explicit or suggestive language, but they find that the majority of stereotypes about both male- and female-gendered arguments are maintained in recent textbooks. Some examples are shown below, and the reader is referred to their article (Cépeda et al. 2021) for more data.

(10) Selected examples from Pabst et al.

a. She snarled at the students who hadn't read the book.

b. Bruce loved and Kelly hated phonology class.

c. Joan believes he is a genius even more fervently than Bob's mother does.

d. Mary entertained the men during each other's vacation.

e. He drove her hard, he stole her fame or would have if he could have.

f. Mohammed buys a house.

g. The woman bought rice for the children.

h. Slavko left his wife.

i. Mary may wonder if John cheats on her.

Finally, Pabst et al. find some differences between male and female authors, such that female authors tended to use a higher proportion of nongendered arguments than male authors did. One female author used a more balanced proportion of gendered arguments, but the other two used similar proportions to the men. Overall there were no statistically significant differences between the proportions of male to female arguments based on the textbook authors' genders.

2.4. Summary

To summarize, the vast majority of problems that afflicted example sentences in syntax textbooks twenty years ago are still present today. The main change has affected explicitly suggestive and stereotypical examples. While Pabst et al. (2018) find a small number of such examples, they are no longer as blatant or numerous. Instead, the discrepancies are made more difficult to detect, although they remain present: the skew now requires a broader lens to observe. Once this is done, however, Pabst et al. find a vast range of ways in which implicit gender biases are present in their sample, such that men are overrepresented and presented more favorably, and women are suppressed. They additionally point out that nonbinary identities are nonexistent in their sample, and that there is a Western bias in choice of names and contexts (see Cépeda et al. 2021 for more details on their findings).

3. Study

On the heels of these results, and following a recent study of French example sentences (Richy & Burnett 2019), in this study we examine all articles published from 1997 to 2018 in three leading theoretical linguistics journals: Language, Linguistic Inquiry, and Natural Language & Linguistic Theory. To foreshadow our results, we systematically arrive at findings similar to those reported in the studies above.

3.1. Methods and design

Unlike previous studies, instead of sampling, we automatically extracted all example sentences from all articles published during the time period under investigation in all three journals, using regular expressions. 4 This was made possible by the fact that constructed example sentences follow a standard formatting in linguistic work: examples are marked by a number enclosed in parentheses, and they are typically removed from the margin, perhaps also including a title, a gloss, and a translation. 5
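The extraction step can be illustrated with a minimal sketch. The pattern below is our own assumption about what a line-initial parenthesized example number looks like; it is not the expression used in the study, and it ignores glosses, translations, and multi-line examples.

```python
import re

# Hypothetical pattern (an assumption, not the study's actual expression):
# a line that opens with a parenthesized number, optionally followed by a
# sub-example letter such as "a.", and then the example text.
EXAMPLE_RE = re.compile(r"^\s*\((\d+)\)\s*(?:[a-z]\.\s+)?(.+)$")

def extract_examples(text):
    """Return (number, sentence) pairs for lines that look like numbered examples."""
    examples = []
    for line in text.splitlines():
        match = EXAMPLE_RE.match(line)
        if match:
            examples.append((int(match.group(1)), match.group(2).strip()))
    return examples

sample = """As shown in (3), the subject raises.
(3) John seems to be happy.
(4) a. Mary gave Sue a book.
See Smith (1999) for discussion."""

examples = extract_examples(sample)
for number, sentence in examples:
    print(number, sentence)
```

A pattern of this kind also captures numbered rules and generalizations, which is one reason a manual cleanup pass is still needed after automatic extraction.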

Twenty-four Yale University undergraduate students were recruited to clean up the examples and provide coding following guidelines, which we detail immediately below. These students had all taken at least one linguistics course, and several were (or subsequently became) linguistics majors. The coding process was as follows. As a first step, coders were instructed to remove all data points that were not example sentences, but rather were mistakenly captured or were generalizations, rules, descriptions of data, or any other text that appeared in numbered examples. Next, examples that did not contain third-person arguments were removed. Next, examples with third-person arguments that did not refer to humans or anthropomorphized individuals were removed from the study. 6 Finally, for each third-person argument in each of the remaining example sentences, our coders provided information along the same lines as the studies surveyed above.

(11) Argument coding in this study

a. Grammatical function (subject, direct object, indirect object)

b. Thematic relations (agent, patient, experiencer, recipient)

c. Lexical choices (pronouns, proper names, violence, physical appearance, etc.)

We decided to focus on major syntactic and semantic roles, as spelled out in 11a–b. Unlike prior work, we did not separately code minor thematic relations such as beneficiary or instrument, anticipating the data to be too sparse for analysis, as was the case in those prior studies, and since our coders had minimal prior exposure to the notion of thematic relations. Such roles were grouped under 'other' and eventually excluded from our analysis.

Coders were instructed to flag any example they considered to be stereotypical in any way. In addition, they were given a list of specific properties to look out for. Following prior work, we were interested in violence, appearances, emotions, romance, 'intellectual activities' such as writing/reading, sexually suggestive/explicit language, and cars. In addition, data for kinship terms was extracted automatically using regular expressions, as the set of such terms is sufficiently limited to easily allow such a search.
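A search for kinship terms of the kind just described might look as follows. This is a sketch under stated assumptions: the term list is a short illustrative sample chosen by us, not the list used in the study, and the function name is hypothetical.

```python
import re

# Illustrative and deliberately incomplete list of English kinship terms
# (an assumption; the study's actual list is not given in the text).
KINSHIP_TERMS = ["mother", "father", "wife", "husband", "sister", "brother",
                 "daughter", "son", "aunt", "uncle"]

# Word boundaries prevent matches inside longer words (e.g. "son" in "person");
# the optional "s" covers simple plurals.
KINSHIP_RE = re.compile(r"\b(" + "|".join(KINSHIP_TERMS) + r")s?\b", re.IGNORECASE)

def kinship_terms_in(sentence):
    """Return the kinship terms found in a sentence, lowercased, in order."""
    return [m.group(1).lower() for m in KINSHIP_RE.finditer(sentence)]

hits = kinship_terms_in("Bill is proud of his father and tired of his mother.")
no_hits = kinship_terms_in("That person bought a reasonable car.")
print(hits)
print(no_hits)
```

Because the set of terms is closed and small, a word-boundary match is usually sufficient, although the gender of the referent still has to be read off the term or the surrounding context.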

In the case of non-English examples, our coders were instructed to rely on the English translation, and additionally to inspect the glosses to ensure that the translation was a good match. 7 We were specifically concerned with possible cases of mismatches between gloss and translation, such as a nongendered third-person pronoun being translated into English as 'he'. Further, as noted above, in cases of gendered languages, we did not include in our data any inanimate objects, even if they bore masculine or feminine agreement. Only human arguments are included in our analysis.

Coders received hands-on training in the coding described here. Their work was inspected on an ongoing basis by one of the authors, and feedback was provided as needed to improve their work. We additionally conducted some post-hoc tests to verify the accuracy of the coding, such as the use of regular expressions to find examples with relevant traits and confirm that they were correctly flagged by the coders, and spot-checking of random individual examples.

Throughout the coders' work, we emphasized speed as a guiding principle. We introduced a notation for decisions that the coders were not sure of. They were encouraged to flag and skip such cases, and not spend any time investigating further through Googling or additional reading. This became particularly useful in the case of unfamiliar names, and is in part the reason for our choice to concentrate only on major thematic relations. In all cases where the 'unsure' notation was used, one of the authors followed up to add the missing details.

Finally, we note that each example contained some metadata, such as the year of publication, journal name, and example number. However, data about the articles' authors was not shown to the coders, and they were not asked to identify whether authors were male or female. This was done for two reasons: first, to avoid any bias that might stem from knowing this fact during the coding process, and second, because we expected our coders to be unfamiliar with many of the authors. We added this information after the coding process was completed. Additional details about the analysis of author gender are provided in §3.4.

3.2. Results

In total, we aggregated all articles from the three journals under investigation published over the twenty years we were concerned with. Of those, 927 articles had example sentences, for a total of 22,954 examples. Overall, 813 articles contained at least one gendered human argument, resulting in a total of 25,106 third-person human arguments for our analysis. We first show results for each of the properties listed in 11, and in the next section discuss meta-analyses (i) over time, (ii) by language of the example, and (iii) by author gender. All results shown here are statistically significant, as confirmed by Pearson's χ² tests with Yates's continuity correction, unless otherwise noted.

Distribution of arguments

We begin by considering the overall distribution of arguments in our sample, shown in Figure 1. Of the 25,106 third-person human arguments identified in our study, 5,571 were coded as female, 12,117 were coded as male, and 7,418 were nongendered or ambiguous (A). 8 Therefore, female arguments make up 22% of all third-person arguments in our sample, ambiguous arguments make up 30%, and male arguments make up the remaining 48%.

Figure 1. Gender distribution of arguments in all example sentences in this study (F: female, M: male, A: ambiguous).

In the remainder of this article we concentrate on arguments that were perceived as either male or female. Of the 17,688 male/female-gendered arguments we coded, 31% were female and 69% were male, for a ratio of 2.2 male arguments for every 1 female argument. These ratios are consistent across the three journals that we studied, as shown in Figure 2.

Specifically, we find that the ratio of female-gendered arguments in the data to the total number of arguments is 32% for articles published in Natural Language & Linguistic Theory, 31% for articles published in Linguistic Inquiry, and 31% for articles published in Language. These gender ratios were not found to be significantly different across journals; therefore, we show combined results for all three journals in the sections that follow.

Figure 2. Gender distribution of arguments in all example sentences by journal (LG: Language, LI: Linguistic Inquiry, NLLT: Natural Language & Linguistic Theory).
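The overall percentages and the male-to-female ratio reported in this subsection follow directly from the three raw counts. The short sketch below recomputes them; the variable names are ours, and only the counts come from the text.

```python
# Raw counts reported in the text: female (F), male (M), nongendered/ambiguous (A).
counts = {"F": 5571, "M": 12117, "A": 7418}

# Share of each category among all third-person human arguments.
total = sum(counts.values())
shares_all = {code: round(100 * n / total) for code, n in counts.items()}

# Share of female arguments among male/female-gendered arguments only,
# and the number of male arguments per female argument.
gendered = counts["F"] + counts["M"]
female_share = round(100 * counts["F"] / gendered)
male_per_female = round(counts["M"] / counts["F"], 1)

print(total, shares_all)
print(gendered, female_share, male_per_female)
```

Note the change of denominator: the 22% figure is computed over all 25,106 arguments, whereas the 31% figure is computed over the 17,688 arguments coded as male or female.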

Grammatical function and thematic relations

Next, we consider the distribution of arguments with respect to syntactic and semantic roles. We find that 83% of male arguments are subjects (9,033 of 10,861), while only 79% of female arguments are subjects (3,724 of 4,738). That is, female-gendered arguments are less likely to occur as subjects and more likely to occur in nonsubject roles (namely direct and indirect objects) as compared to male-gendered arguments. Similarly, we observe a skew in thematic relations: female arguments represent 30% of agents (2,714 of 9,099), 30% of experiencers (1,110 of 3,736), 35% of patients (1,193 of 3,431), and 42% of recipients (338 of 798). Since female arguments comprise 31% of the sample overall, this means that they are overrepresented among patients and recipients. These findings are summarized in Figure 3.

Figure 3. Distribution of arguments by grammatical function (left) and thematic role (right).

That is, women are often described as passive observers in a male-dominated world. They are not initiators of actions but rather are more likely to be on the receiving end. They lack independent agency.
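The subject/nonsubject contrast reported above can be checked with the test named in §3.2, a Pearson's χ² test with Yates's continuity correction. The sketch below is ours and is not the authors' analysis code. It assumes that the nonsubject counts are the reported totals minus the reported subject counts, and it uses the standard shortcut formula for 2×2 tables.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Pearson's chi-square with Yates's correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    # Shortcut formula for 2x2 tables; max(..., 0) keeps the correction from
    # overshooting when |ad - bc| is smaller than n / 2.
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    chi2 = numerator / (row1 * row2 * col1 * col2)
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Subject counts and totals as reported in the text; nonsubjects are derived
# by subtraction (an assumption about how the cells are defined).
male_subj, male_total = 9033, 10861
female_subj, female_total = 3724, 4738

chi2, p_value = chi2_yates_2x2(male_subj, male_total - male_subj,
                               female_subj, female_total - female_subj)
print(chi2, p_value)
```

Under these assumptions the statistic is well above 3.84, the critical value for one degree of freedom at the 0.05 level, which agrees with the statement that the difference is significant.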

Pronouns and proper names

We examine next the distribution of pronouns and proper names in our sample in order to explore whether some of the skew in our sample could be specifically attributed to these categories. In short, we find a small skew with respect to pronouns and no skew where proper names are concerned. We conclude that the skews we observe are more fundamental and not attributable to one class of examples alone.

Turning first to pronouns, we find that male-gendered pronouns are significantly more common than female-gendered ones. Male-gendered pronouns make up 29% of all male-gendered arguments, while female-gendered pronouns make up only 23% of female arguments. We tentatively suggest that this may be a side effect of the overall prevalence of male-gendered arguments, especially subjects, in the data. If a pronoun were to refer back to an argument, that argument is more likely to be male than female. 9

Figure 4. Pronouns (left) and proper names (right) in the data.

Concentrating more closely on names, we observed a total of 10,743 names in our study. Names were coded for gender in one of two ways. If a name was referred to using a gendered pronoun, for example, Taylor looked at a picture of herself, it was gendered accordingly. Names without referential pronouns were coded for gender based on coders' judgments of that name's stereotypically perceived gender. Of the personal names identified in the data, 428 were identified as referring to nongendered or ambiguously gendered arguments. The remaining 10,315 names were coded as either male or female, as represented in Figure 4. Female-gendered arguments comprised 32% of names (3,263 of 10,315) and 31% of nonname arguments (2,305 of 7,360), similar ratios to the 31% of female-gendered arguments in the data set as a whole. Likewise, the proportion of gendered arguments that were proper names was similar for male and female arguments, at 58% and 59%, respectively, a difference that was not found to be significant.

To achieve maximal accuracy in this section, we manually examined all names not familiar to our undergraduate student coders. We did not use an automatic gender classifier because we did not trust the ones currently available: these classifiers tend to have a bias toward Western names, and they do not typically do well with ambiguous names. They also would not be able to deal with the cooccurrence of a pronoun in the sentence that might disambiguate the gender of a name, nor with information available in the gloss or translation of a non-English example.

We coded a name as male or female if (i) we were familiar with the name and language, (ii) the gloss or translation explicitly stated that the name was gendered, (iii) the gloss indicated gendered agreement, (iv) there was a gendered pronoun corresponding to the name in the example, (v) an online search indicated that the name was clearly used only for one gender, or (vi) a community of linguist experts we consulted indicated their familiarity with the language and name and confirmed that the name was gendered. 10 The top five names in each gender category we coded are presented in Figure 5.

Figure 5. Top five most common male (left) and female (right) names.

Among the top names, we find that 30% of all male names are John. Concomitantly, 31% of all female names are Mary. In fact, two of the top five male names are 'John' variants, John and Juan, whereas three of the top five female names are 'Mary' variants, Mary, Maria, and Marie. That is, we observe a strong Western/Christian bias in authors' choices, with little variability. This applies to name choices regardless of gender.

Lexical choices

We also consider various lexical choices about the example sentences in our study. We start with examples that refer to an argument's occupation. We find that male-gendered arguments are overrepresented in such examples. While male arguments outnumber female arguments 2.2:1 overall, the ratio is close to 3:1 among arguments described as having professions (i.e. 74% are male), as shown in Figure 6.

Next we examine examples that involve violent events of some kind. Observing first the overall distribution of arguments, we find that male-gendered arguments are massively overrepresented in such examples: 84% of the arguments in these examples are male (467 of 559, a ratio higher than 4:1). Within each gender category, however, we find a similar proportion of subjects and nonsubjects in sentences describing violence. Subjects comprise 72% of all male-gendered arguments (335 of 467) and 68% of all female-gendered arguments (63 of 92). These findings are shown in Figure 7.

Figure 6. Arguments whose occupation is discussed in the example.

Figure 7. Violent events by gender (left) and by gender × syntactic position (right).

Turning to examples involving the expression of romantic or sexually suggestive content, we now find that the gender distribution in examples is remarkably different. Here, female-gendered arguments are overrepresented, comprising 50% of all arguments (204 of 406). Taking into account that male-gendered arguments are generally overrepresented at a 2.2:1 ratio compared to female-gendered arguments, this constitutes a remarkable skew. Importantly, this difference interacts with grammatical function. The overrepresentation of female arguments in these sentences is also seen in their greater overrepresentation as nonsubjects. Only 58% of female-gendered arguments are subjects (118 of 204), whereas 76% of male-gendered arguments are subjects (153 of 202). To state it another way, 44% of subjects (118 of 271) and 64% of nonsubjects (86 of 135) are female. That is, female arguments occur more frequently in sentences with romantic/sexual content than in the data set as a whole, and also appear more often in those sentences as nonsubjects. See Figure 8.

Finally, female-gendered arguments are massively overrepresented among those referred to by kinship terms: 56% of all such arguments are female-gendered (420 of 744), as seen in Figure 9. Considering again the overall 2.2:1 male skew in the data, this is a particularly striking finding.
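The comparisons in this subsection all set a category's female share against the share in the data set as a whole. The following Python sketch reproduces that arithmetic from the counts reported in the text (12,117 male and 5,571 female arguments overall). The names `female_share` and `representation_index` are illustrative helpers introduced here, not part of the study's own tooling, and the index is only one possible way to express the skew.

```python
# Overall counts of gendered arguments reported in the article.
MALE_TOTAL = 12_117
FEMALE_TOTAL = 5_571

# (female, total) gendered-argument counts per category, as reported in the text.
CATEGORIES = {
    "violence": (92, 559),
    "romantic/sexual": (204, 406),
    "kinship terms": (420, 744),
}


def female_share(female: int, total: int) -> float:
    """Proportion of gendered arguments that are female-gendered."""
    return female / total


# Female share in the data set as a whole (the baseline for comparison).
BASELINE = female_share(FEMALE_TOTAL, MALE_TOTAL + FEMALE_TOTAL)


def representation_index(female: int, total: int) -> float:
    """Category share divided by baseline share (hypothetical helper).

    Values above 1 indicate female overrepresentation relative to the
    data set as a whole; values below 1 indicate underrepresentation.
    """
    return female_share(female, total) / BASELINE


if __name__ == "__main__":
    print(f"baseline female share: {BASELINE:.1%}")
    for name, (female, total) in CATEGORIES.items():
        share = female_share(female, total)
        index = representation_index(female, total)
        print(f"{name}: {share:.1%} female, index {index:.2f}")
```

An index well below 1 for violence and well above 1 for romantic/sexual content and kinship terms corresponds to the skews described above.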

We also coded for example sentences relating to books and other intellectual activities. There were 1,385 such examples, and of these 324 were associated with female-gendered arguments, 679 were associated with male-gendered ones, and 382 were

Figure 8. 'Romantic' events by gender (left) and by gender × grammatical function (right).

Figure 9. Use of kinship terms by gender.

associated with nongendered arguments. As this distribution does not differ from the overall gender ratio, we do not comment further on these results. 11

Sentiment analysis

Finally, we use the R function 'get_sentiments' in the 'tidytext' package (Silge & Robinson 2016) to run sentiment analysis on the data. We find that male arguments are overrepresented in sentences conveying negative emotions such as anger and fear, while female arguments are overrepresented in sentences conveying positive emotions such as joy and trust. 12

Predicates were counted once for each gender-coded argument associated with them; that is, a predicate with gendered subject and object arguments is counted twice. Sentiments were determined using both the 'Bing' (Liu 2012) and 'NRC' (Mohammad & Turney 2013) methods of categorization, with corresponding lexicons. The Bing method bins predicates into binary positive and negative sentiment categories, while the NRC method categorizes predicates into ten distinct groups: anger, anticipation, disgust, fear, joy, sadness, surprise, trust, and negative and positive categories that function as an 'elsewhere' case, as each predicate is assigned to only one of these ten categories. If a predicate is not included in the lexicons of the method in use, the 'get_sentiments' function excludes it. Specifically, of the 17,688 predicates in the data, only 2,389 predicates were categorized by the Bing method, and 11,374 predicates were categorized by the NRC method. The rest of the predicates are excluded from the analysis in this section.
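The counting procedure just described can be sketched in a few lines of Python. This is not the R/tidytext pipeline used in the study: the lexicon below is a toy stand-in whose entries are invented for illustration (they are not taken from the Bing or NRC lexicons), the example records are hypothetical, and the exclusion of unlisted predicates is mimicked with a simple dictionary lookup.

```python
from collections import Counter

# Toy stand-in for a binary sentiment lexicon; these entries are invented
# for illustration and are NOT taken from the Bing or NRC lexicons.
TOY_LEXICON = {
    "love": "positive",
    "praise": "positive",
    "hit": "negative",
    "fear": "negative",
}

# Each record: (predicate, genders of its gender-coded arguments).
# Hypothetical data, not drawn from the study.
EXAMPLES = [
    ("love", ["male", "female"]),  # gendered subject and object: counted twice
    ("hit", ["male"]),
    ("praise", ["female"]),
    ("devour", ["male"]),          # not in the lexicon: excluded
]


def tally_sentiments(examples, lexicon):
    """Count (sentiment, gender) pairs, once per gender-coded argument."""
    counts = Counter()
    for predicate, genders in examples:
        sentiment = lexicon.get(predicate)
        if sentiment is None:
            continue  # predicates missing from the lexicon are dropped
        for gender in genders:
            counts[(sentiment, gender)] += 1
    return counts


def male_to_female_ratio(counts, sentiment):
    """Male-to-female argument ratio for one sentiment category."""
    female = counts[(sentiment, "female")]
    if female == 0:
        return float("inf")  # no female arguments in this category
    return counts[(sentiment, "male")] / female


if __name__ == "__main__":
    counts = tally_sentiments(EXAMPLES, TOY_LEXICON)
    for sentiment in ("positive", "negative"):
        print(sentiment, male_to_female_ratio(counts, sentiment))
```

Counting per argument rather than per predicate means a single sentence can contribute to both the male and the female tally, which is why the per-sentiment ratios are comparable to the overall male-to-female ratio.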

Results from the Bing categorization of sentiments, shown in Figure 10, reveal a slight skew in the genders of arguments in positive versus negative sentences. While the overall male-to-female ratio of arguments is 2.2:1, the gender ratio for negative sentiments is 2.5:1, indicating a skew toward male arguments in these sentences. Conversely, female arguments are slightly overrepresented in positive sentences; the gender ratio for positive sentiments is 1.7:1.

Figure 10. Positive/negative sentiments categorized by the Bing method, by gender.

This trend continues to be borne out in the results of the NRC method of analysis. Figure 11 shows the male-to-female argument ratios for each of the ten sentiments identified with this method, with the black line showing the overall gender ratio of 2.2:1 for reference. The negative sentiments—fear, anger, negative, sadness, and disgust—have higher ratios (to the right of the black line), indicating an overrepresentation of male arguments in these sentences. The ratio is as high as three male arguments for every female argument, for fear and anger. The clearly positive sentiments—positive, trust, and joy—show the opposite trend, with lower-than-average ratios indicating an overrepresentation of female arguments. This also holds for anticipation, which is not clearly positive or negative; finally, surprise is about at the average ratio given the general 2.2:1 distribution of male-to-female arguments.

These results are in keeping with previous results in our study, showing an overrepresentation of male arguments in violent sentences, and overrepresentation of female arguments in romantic/sexual sentences. Female arguments are more often used in sentences concerning positive emotions, while male arguments are more often used in sentences conveying negative, and especially violent and angry, sentiments.

3.3. Some illustrative examples

As in Pabst et al. 2018, and unlike the earlier M&B 1997 study, we do not find many examples with sexually explicit or suggestive language. Nonetheless, we find that stereotypes of both genders are commonly used in linguistic example sentences. We provide a sample of such sentences below.

Figure 11. Male-to-female argument ratios by sentiment, categorized by the NRC method; black line indicates overall M-to-F ratio of 2.2:1.

Most of the examples in 12 reflect more than one type of bias or stereotype. For example, examples 12a–c show stereotypical choices involving the object of the verb wash, as well as stereotypical gender roles. Examples 12d–e describe a woman's mental state as unstable. Examples 12f–j suggest that women are not as intellectually capable as men: men are Nobel Prize winners, professors, students, geniuses, and—remarkably—linguists. Examples 12j–l illustrate stereotypical uses of kinship terms. Examples 12l–n exemplify various references to violence. Finally, the remaining examples illustrate other stereotypical lexical choices. These examples also attempt to roughly reproduce the overall 2:1 skew of male to female subjects in our sample.

(12) Some stereotypical examples found in our study

a. John ate the meal and Mary cleaned the dishes.

b. John didn't eat the meal because he would have to clean the dishes.

c. John (not Peter) washed cars well.

d. John told Bill that Mary began to cry without any reason.

e. *Kelly broke again tonight when she did the dishes.

f. Which Nobel prize winning author came in his car?

g. The students are all the boys.

h. At least one student of every professorᵢ is horrified at hisᵢ grading procedure.

i. No linguist₁ here recommended some of his₁ own books, but I don't know which of his₁ own books.

j. Ray₁ mother thinks he₁ a genius.

k. Aoyama's sister-in-law knitted a scarf.

l. An Iraqi father drowned his 17 year old daughter.

m. Rabe forced women to wash clothes.

n. to leave the maiden … unmolested

o. Married him, didn't she/*Marge/%the gold digger?

p. Bill won't go to the bar and James to the liquor store.

q. Mary thought that it pleased John [pro to speak his/*her mind].

r.

i. ??John seems considered a fool.

ii. Also, Anne Elliot seems considered a spinster by everyone, including herself …

Here it is important to note that we are not cherry-picking examples. These are representative of the examples flagged by our undergraduate coders. We furthermore note that the stereotypes we observe lean in both directions: all individuals are cast into stereotypical roles.

3.4. Meta-analysis

Finally, we comment on three additional aspects of the data we have collected.

The language of the examples

First, we note that whether the example was in English or another language did not affect the results we observe. We find a total of 33% female-gendered arguments in English examples, and 30% female-gendered arguments in non-English examples. See Figure 12.

Figure 12. Gender distribution of arguments in all example sentences in this study.

We believe that this, again, is suggestive of a broad issue in our field. The bias is not introduced by non-English examples, where some constraints on data collection may affect the sentences tested by field linguists in various ways. Instead, English and non-English examples show a similar bias, indicating that access to data or speakers, or familiarity with the language, does not significantly affect the gender distribution of arguments.

Gendered arguments over time

Next, considering the distribution of male versus female arguments in example sentences over time, we notice a positive trend over the past twenty years, as seen in Figure 13. However, at no point—in no year over the twenty of this study—were the genders at parity. On average, we have moved 3% or so closer to parity over the course of two decades. At this pace, all things being equal, we would not expect to reach equality in the use of male and female arguments for at least fifty more years.
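The projection in the preceding paragraph is a simple linear extrapolation, which can be made explicit as follows. The inputs are assumptions read off approximate figures in this article: a current female share of roughly 35% (see finding 15b) and a gain of somewhere between 3 and 5 percentage points per twenty years. The function name `years_to_parity` is introduced here for illustration, and a constant pace of change is itself an assumption.

```python
PARITY = 50.0  # percent female-gendered arguments at parity


def years_to_parity(current_share: float, gain_per_period: float,
                    period_years: float = 20.0) -> float:
    """Linear extrapolation of the time needed to reach parity.

    current_share and gain_per_period are in percentage points; the gain
    is the change observed over one period of `period_years` years.
    """
    if gain_per_period <= 0:
        raise ValueError("no progress toward parity at this pace")
    remaining = PARITY - current_share
    return max(remaining, 0.0) / gain_per_period * period_years


if __name__ == "__main__":
    # Assumed inputs: about 35% now, gaining 3 to 5 points per two decades.
    for gain in (3.0, 5.0):
        years = years_to_parity(current_share=35.0, gain_per_period=gain)
        print(f"gain of {gain:g} points per 20 years: about {years:.0f} more years")
```

Under either assumed pace the estimate exceeds fifty years, in line with the lower bound stated above.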

Moreover, when comparing the distribution of arguments by syntactic position over time, as shown in Figure 14, we observe that the increase in Figure 13 may be attributed to an increase in the proportion of female objects in example sentences over time. In fact, the proportion of female subjects appears to have slightly decreased over time and remains below 40% in the entire time period we study.

The gender of authors

Finally, we ask whether the gender of the authors of the articles we examined had an effect on the choices these authors make in their examples. Specifically, we consider whether the gender of an article's author(s) has an effect on (i) the use of male versus female arguments, or (ii) the use of gendered versus nongendered

Figure 13. Gender distribution of arguments over time.

Figure 14. Gender distribution of subjects (left) and objects (right) over time.

arguments such as the student, the children, and so forth. We manually classified all authors of the articles we examined into the binary conceptual gender categories of male and female based on their names at the time of publication, and additionally retained information about whether each author was the first author of the article. 13

In the three journals considered in this study, the proportion of female-authored articles has risen over time but has never exceeded 42% of all articles published. In fact, the graph in Figure 15 suggests that the upward trend in the proportion of female authors levels out around 2005 and remains stable for the next twelve years we examined.

To analyze the effect that author gender has on the types of arguments used, we constructed two logistic mixed-effects models in R (R Core Team 2013) using the 'lme4' package (Bates et al. 2015). The first model concerns the use of male versus female gendered arguments. The dependent variable in this model was a binary factor indicating gender as 'male' or 'female'; author gender was included as a fixed effect, with a random effect of individual author to control for individual variation. 14 The second

Figure 15. Proportion of female first authors over time.

model includes the same fixed and random effects, but the dependent variable consists of arguments categorized as 'gendered' (male or female) or 'nongendered'/ambiguous.

Figure 16 shows the distribution of male, female, and nongendered arguments by gender of author. Male-authored articles have a higher total number of arguments than female-authored ones because of the overall disparity in publishing rates, and not because of any systematic gender differences in use of example sentences. Both male and female authors overrepresent male arguments in their example sentences relative to female ones. However, while female authors write example sentences with female arguments in 35.5% of gendered arguments, male authors include female arguments only 31.7% of the time. This represents a statistically significant (|z| = 2.39, p < 0.05) difference in gendered argument ratios depending on the author's gender.

We also consider whether an author's gender is a good predictor of the use of nongendered arguments, such as nongendered common nouns and plurals. 15 Female authors use nongendered arguments in 20.8% of their example sentences, while male authors use them 23.6% of the time. This difference was not found to be significant (|z| = 1.03, p > 0.1).

Figure 16. Distribution of argument gender, by gender of first author.

3.5. Summary

To summarize our main findings, male-gendered arguments are overrepresented in the sample overall and in particular as subjects and agents. In addition, both men and women occur in many examples stereotypical of their gender. These results are consistent across journals and time, and replicate findings for textbooks in prior work. Our findings are summarized in 13–15.

(13) Main findings: male-gendered arguments

a. appear about twice as often as female-gendered arguments overall

b. appear more often as subjects and agents

c. engage in significantly more violence

d. have significantly more occupations

e. tend to exhibit more negative emotions

(14) Main findings: female-gendered arguments

a. are overrepresented as nonsubjects, especially as recipients

b. are overrepresented in sentences involving romantic/sexual language

c. are massively overrepresented among arguments referred to using kinship terms

d. tend to exhibit more positive emotions

(15) Main findings: other trends in the data

a. no effect of language of example (English vs. non-English) on the results

b. a small trend of improvement in gender ratios: female-gendered arguments grow from around 30% to close to 35%, attributable to an increase in the proportion of female objects over time

c. few or no overtly suggestive or explicit examples

d. persistent gendered stereotypes very much evident

4. Discussion and conclusion

Example sentences are one of the main sources of data in theoretical linguistics. Some examples become enshrined as 'canonical', often divorced from their original sources. As scientists we are trained to regard such data as an impartial, empirical source of evidence in support of our arguments, whose ultimate goal is to further our understanding of the language faculty. However, we often ignore the social aspects that these examples occur in and that they exemplify. We have demonstrated here that constructed example sentences used in the linguistic literature may encode implicit biases (even at a very subtle level). These then get handed down to new generations of linguists, perpetuating a cycle.

This article provided a comprehensive study of the distribution of conceptual gender in constructed example sentences. We chose three leading journals in theoretical linguistics: Language, Linguistic Inquiry, and Natural Language & Linguistic Theory, and investigated all articles published in these journals from 1997 to 2018. As the results in §3 show at length, we find that gender bias permeates many aspects of these examples: from the fact that male-gendered arguments are overrepresented at a 2.2:1 ratio compared to female arguments, to a multitude of gender stereotypes affecting both genders, to the fact that this trend is consistent and unchanged over time, across the three journals investigated, and by the language of the example.

As we noted at the outset of this article, we take the makeup of example sentences—the arguments they use and the predicates in them—to be signals to students and researchers alike about what we take the world to be like: who is a free-thinking agent, a genius, or likely to be a professor or a student; and who is a recipient of others' actions or belongings, the object of their affections, a caregiver, or a spouse. While not as blatant as in previous decades, the bias is nonetheless extensive and pervasive. It sends a powerful message about who is welcome in our field and who is less so. This, in turn, affects the kinds of research questions that are welcome and the kinds of answers that we expect and ultimately adopt. It should be in everyone's interests to increase the inclusivity of our field. The more diverse we are as a field, the richer our ideas, research topics, approaches, types of data collected, and solutions.

The gender biases we observed in our study do not occur in isolation: rather, they reflect biases that are prevalent across our society at large, and in the field of linguistics more broadly. For example, studies have found similar biases in English-language textbooks for various dialects of English (e.g. Bergvall 1996a, Lee & Collins 2010, Lewandowski 2014, Tarrayo 2014) and recently in example sentences in French journals (Richy & Burnett 2019). The issue has garnered some attention in recent LSA presentations, panels, and workshops, as cited in the introduction. It has featured in discussions of the representation of gender more broadly, including in reference to language acquisition (e.g. Eckert & McConnell-Ginet 2013, McConnell-Ginet 2014, Leslie et al. 2015, Meyer et al. 2015, Bian et al. 2017, among others).

This trend may also be seen as part of a broader trend sometimes referred to as the 'leaky pipeline', whereby fewer and fewer nonmale individuals are represented in an academic field the higher the rank of the individual (see e.g. Valian 1998, 2005, Goulden et al. 2011, among many others). Recently, some work has shown that this trend can be observed in linguistics, too, as part of a general trend of less favorable conditions for nonmale individuals—for example, in the LSA's 'The state of linguistics in higher education: Annual report 2019' 16 (Linguistic Society of America 2020) and the University of Michigan's 'Survey of linguists and language researchers'. 17 We join the many authors who study these issues in the belief that the field of linguistics would benefit from more inclusive citation, as well as hiring and promotion practices.

Nonetheless, as scientists, we believe that we could and should strive to avoid perpetuating bias, even if implicit, in our work. We argue here that better-constructed example sentences, using inclusive language, can send an important message to the field: inclusive language encourages participation from underrepresented groups, leading to a better community and therefore to better science, at the cost of just a little more thoughtfulness.

In the remainder of this section we describe several concrete actions that we, as researchers and teachers, can take to improve on the current state of the field. We additionally refer the reader to the section 'Actions people can take' on the University of Michigan website (see n. 17).

First and foremost, as an author and teacher, pay attention to the ratio of gendered arguments in your example sentences, and to the distribution of grammatical functions and thematic relations of those arguments. Strive for equity in the examples you construct or choose to cite. Consider in general the nature of the world that is portrayed through your example sentences and any stereotypes or misconceptions they may inadvertently fall into. Consider inclusivity beyond binary genders and beyond heteronormative gender roles. Your examples could additionally serve to send a message about the diversity of races and ethnicities.

Avoid the use of gendered lexical items such as -man and he where not necessary. Adopt and encourage instead the use of singular they as a more inclusive pronoun when referring to (singular) nouns whose gender is unknown. Consider using singular they even when the argument's gender is known, but is irrelevant to the example (see e.g. Bjorkman 2017, Ackerman 2019, Bradley et al. 2019). Use inclusive nouns such as Congressperson and humankind. Keep the LSA's 'Guidelines for inclusive language' (Linguistic Society of America 2016) in mind when writing, as well as when reviewing and editing papers.

When using names in example sentences, consider using diverse names, paying explicit attention to the distribution of gender in your examples. We do not advise here the use of 'gender-neutral' names, given that existing research shows that such names are often not truly perceived as neutral, and furthermore that the perceived gender of a given name may change over time (see Barry & Harper 1982, Van Fleet & Atwater 1997, Lieberson et al. 2000, Hahn & Bentley 2003, among others). Your examples could include non-Western names as an additional signal of diversity. Sources for diverse names include the database of names compiled by Sanders et al. (2020), which provides names for every letter of the English alphabet from different languages and cultures, categorized by gender (feminine, masculine, nonbinary), and Kirby Conrod's list of nonbinary names. 18

When citing the existing literature, notice the trends represented in your examples. Where possible, consider citing a different source for better or more balanced examples. Likewise, when possible, you may choose to paraphrase an original example to avoid stereotypes and give the citation as 'following' or 'minimally changed from' the original. We acknowledge that in some cases this may not be possible or it may be difficult, for example, when citing literature on less-studied languages with fewer resources. In such cases, look for ways to offset any imbalances introduced in your examples in the rest of the text.

We next acknowledge and discuss two concerns that frequently arise in this context: How would one navigate the issues we have raised here (i) during fieldwork elicitation, and (ii) in analyzing naturalistic corpus data? Unlike in constructed example sentences, where the linguist directly composes the data, data in both scenarios is constrained by additional factors.

Concentrating first on data collected in a fluid elicitation scenario, researchers should be mindful about gathering data that does not deal with sensitive or harmful topics, especially when the subject matter under discussion does not affect one's scientific aims. Similar to fully constructed example sentences, sentence elicitation requires presenting a collaborator with small variations of the same sentence to tease apart linguistic differences. Before going on an extended inquiry, consider if there is implicit bias or stereotyping in your baseline sentence. When conducting fieldwork in general, consider redirecting the topic of discussion to something equal in evidentiary value while avoiding perpetuating stereotypes, whenever appropriate to do so.

Considering next corpus data, such data may comprise a variety of text types, such as narratives and conversations. When working with data drawn from corpora, we recommend exercising choice in which examples you cite, wherever that is possible. If multiple examples are available that equally illustrate the concept under study, be mindful in selecting data that avoids enshrining social biases into an analysis. We readily acknowledge that our recommendations for both fieldwork and corpus work are an ideal, and will not always be possible. However, we expect that mindfulness and explicit planning of elicitation and corpus searches that take these issues into account will help mitigate at least some observed biases in current work.

Before concluding, we return to the limitations of our study. First, as noted at the outset, we relied on conceptual (perceived) gender and assume a gender binary. This clearly does not represent the reality of gender outside of the linguistic literature, but it is a faithful representation of assumptions made within this literature. We furthermore do not examine the representation of race, nonbinary individuals, noncisgender or nonheterosexual representations, or any other minoritized or marginalized representation of individuals.

We additionally restricted ourselves to published work and did not, for example, study unpublished manuscripts, preprints, or prior versions of published papers, nor conference presentation handouts or slides, or even conference proceedings papers. Moreover, the choice to focus our attention on example sentences means that the majority of our data comes from works published in the fields of syntax and (to a lesser extent) semantics. We acknowledge this limitation of our work, but believe that the findings here are relevant to any linguist who engages with example sentences in any capacity—be it in their research or in their teaching, including, importantly, in introductory linguistics courses.

Finally, we concentrated here on journals that publish mainly constructed examples. In preliminary work leading up to this study, we considered comparing such examples to those from corpora and naturally occurring speech. In particular, we mined data from the journals Language Documentation and Description and Language in Society following the same procedures described in §3.1. However, we found the data in both journals to (i) use far fewer example sentences in general, (ii) be more heavily composed of first- and second-person arguments, and (iii) contain more naturalistic dialogue not easily parsed into separate sentences. On the whole we decided that it would not be possible to compare the latter journals to the other three represented in this study, and instead chose to leave such a comparison to a separate future study. However, we acknowledge that this is a potential factor that could affect the distribution of gender of arguments, which could be interesting and important to consider.

Hadas Kotek
Massachusetts Institute of Technology
Rikker Dockum
Swarthmore College
Sarah Babinski
Yale University
Christopher Geissler
Yale University
[Received 17 August 2020;
revision invited 9 December 2020;
revision received 4 February 2021;
accepted pending revisions 28 March 2021;
revision received 29 March 2021;
accepted 29 March 2021]

REFERENCES

Ackerman, Lauren. 2019. Syntactic and cognitive issues in investigating gendered coreference. Glossa: A Journal of General Linguistics 4(1):117. doi:10.5334/gjgl.721.
Barry, Herbert, III, and Aylene S. Harper. 1982. Evolution of unisex names. Names 30.15–22. doi:10.1179/nam.1982.30.1.15.
Barry, Herbert, III, and Aylene S. Harper. 1993. Feminization of unisex names from 1960 to 1990. Names 41.228–38. doi:10.1179/nam.1993.41.4.228.
Bates, Douglas; Martin Mächler; Ben Bolker; and Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67.1–48. doi:10.18637/jss.v067.i01.
Bergvall, Victoria L. 1996a. Humpty Dumpty does syntax: Through the looking-glass, and what Alice found there. Natural Language & Linguistic Theory 14.433–43. doi:10.1007/BF00133689.
Bergvall, Victoria L. 1996b. 'Merely data'? Gender bias in constructed linguistic examples. Paper presented at the 70th annual meeting of the Linguistic Society of America, San Diego.
Bian, Lin; Sarah-Jane Leslie; and Andrei Cimpian. 2017. Gender stereotypes about intellectual ability emerge early and influence children's interests. Science 355.389–91. doi:10.1126/science.aah6524.
Bjorkman, Bronwyn. 2017. Singular they and the syntactic representation of gender in English. Glossa: A Journal of General Linguistics 2(1):80. doi:10.5334/gjgl.374.
Boroditsky, Lera; Lauren A. Schmidt; and Webb Phillips. 2003. Sex, syntax, and semantics. Language in mind: Advances in the study of language and thought, ed. by Dedre Gentner and Susan Goldin-Meadow, 61–80. Cambridge, MA: MIT Press.
Bradley, Evan D.; Julia Salkind; Ally Moore; and Sofi Teitsort. 2019. Singular 'they' and novel pronouns: Gender-neutral, nonbinary, or both? Proceedings of the Linguistic Society of America 4:36. doi:10.3765/plsa.v4i1.4542.
Burkholder, Ross; Veena Patel; and Jason Riggle. 2020. Flame war: The context of hate speech in online games. Paper presented at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Carr, Christine; Melissa Robinson; and Alexis Palmer. 2020. Improving hate speech detection precision through an impoliteness annotation scheme. Paper presented at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Cépeda, Paola; Hadas Kotek; Katharina Pabst; and Kristen Syrett. 2021. Gender bias in linguistics textbooks: Has anything changed since Macaulay & Brice 1997? Language 97(4).678–702.
Charity Hudley, Anne H. 2020. Fostering a culture of racial inclusion in linguistics: For the children of the 9th Ward circa 2005. Plenary presentation at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Charity Hudley, Anne H.; Christine Mallinson; Mary Bucholtz; Nelson Flores; Nicole Holliday; Elaine Chun; and Arthur Spears. 2018. Linguistics and race: An interdisciplinary approach towards an LSA statement on race. Proceedings of the Linguistic Society of America 3:8. doi:10.3765/plsa.v3i1.4303.
Clements, Gaillynn; Christina Higgins; Okim Kang; Melinda Reichelt; and Walt Wolfram. 2019. Linguistic discrimination on the university campus. Panel at the 93rd annual meeting of the Linguistic Society of America, New York City.
Duffy, Susan A., and Jessica A. Keir. 2004. Violating stereotypes: Eye movements and comprehension processes when text conflicts with world knowledge. Memory & Cognition 32.551–59. doi:10.3758/BF03195846.
Eckert, Penny; Sharon Inkelas; Gregory Ward; Kristen Syrett; Itamar Francez; Anne H. Charity Hudley; and Kathryn Campbell-Kibler. 2018. Our linguistics community: Addressing bias, power dynamics, harassment. Special panel at the 92nd annual meeting of the Linguistic Society of America, Salt Lake City. Online: https://www.linguisticsociety.org/sites/default/files/Our%20Linguistics%20Community.pdf.
Eckert, Penny, and Sally McConnell-Ginet. 2013. Language and gender. Cambridge: Cambridge University Press.
Garnham, Alan; Jane Oakhill; and David Reynolds. 2002. Are inferences from stereotyped role names to characters' gender made elaboratively? Memory & Cognition 30.439–46. doi:10.3758/BF03194944.
Goulden, Marc; Mary Ann Mason; and Karie Frasch. 2011. Keeping women in the science pipeline. The ANNALS of the American Academy of Political and Social Science 638.141–62. doi:10.1177/0002716211416925.
Gygax, Pascal; Ute Gabriel; Oriane Sarrasin; Jane Oakhill; and Alan Garnham. 2008. Generically intended, but specifically interpreted: When beauticians, musicians, and mechanics are all men. Language and Cognitive Processes 23.464–85. doi:10.1080/01690960701702035.
Hahn, Matthew W., and R. Alexander Bentley. 2003. Drift as a mechanism for cultural change: An example from baby names. Proceedings of the Royal Society B: Biological Sciences 270.S120–S123. doi:10.1098/rsbl.2003.0045.
Haugen, Jason D., and Amy V. Margaris. 2020. Faculty placements into linguistics PhD programs across the US and Canada: Market share and gender distribution. Paper presented at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Kennison, Shelia M., and Jessie L. Trofe. 2003. Comprehending pronouns: A role for word-specific gender stereotype information. Journal of Psycholinguistic Research 32.355–78. doi:10.1023/A:1023599719948.
Kibbey, Tyler. 2019. Transcriptivism: An ethical framework for modern linguistics. Paper presented at the 93rd annual meeting of the Linguistic Society of America, New York City.
Kibbey, Tyler; Lal Zimman; Archie; Chloe Brotherton; Will Hayworth; Joel N Jenkins; and Bryce McCleary. 2020. Queer and trans digital modalities. Panel at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Kreiner, Hamutal; Patrick Sturt; and Simon Garrod. 2008. Processing definitional and stereotypical gender in reference resolution: Evidence from eye movements. Journal of Memory and Language 58.239–61. doi:10.1016/j.jml.2007.09.003.
Kurumada, Chigusa, and Bethany Gardner. 2019. 'You're good at math for a woman': An experimental analysis of gender-based microaggressions. Paper presented at the 93rd annual meeting of the Linguistic Society of America, New York City.
Lanehart, Sonja; Anne Charity Hudley; Jennifer Bloomquist; Dominique Branson; Kendra Calhoun; Tracy Conner; Jazmine Exford; Shelome Gooden; Jessi Grieser; Shenika Hankerson; et al. 2020. Black becoming for language and linguistics researchers. Panel at the 94th annual meeting of the Linguistic Society of America, New Orleans.
Language and Life Project. 2018. Talking Black in America. Film presented at the 92nd annual meeting of the Linguistic Society of America, Salt Lake City. Online: https://www.talkingblackinamerica.org/.
Lee, Jackie, and Peter Collins. 2010. Construction of gender: A comparison of Australian and Hong Kong English language textbooks. Journal of Gender Studies 19.121–37. doi:10.1080/09589231003695856.
Leonard, Wesley Y.; Megan Lukaniec; Christina Laree Newhall; Kari A. B. Chew; Crystal Richardson; William Madrigal, Jr.; and Raymond Huaute. 2018. Sharing our views: Native Americans speak about language and linguistics. Panel at the 92nd annual meeting of the Linguistic Society of America, Salt Lake City.
Leslie, Sarah-Jane; Andrei Cimpian; Meredith Meyer; and Edward Freeland. 2015. Expectations of brilliance underlie gender distributions across academic disciplines. Science 347.262–65. doi:10.1126/science.1261375.
Lewandowski, Marcin. 2014. Gender stereotyping in EFL grammar textbooks: A diachronic approach. Linguistik Online 68(6).83–99. doi:10.13092/lo.68.1635.
Lieberson, Stanley; Susan Dumais; and Shyon Baumann. 2000. The instability of androgynous names: The symbolic maintenance of gender boundaries. American Journal of Sociology 105.1249–87. doi:10.1086/210431.
Linguistic Society of America. 2016. Guidelines for inclusive language. Online: https://www.linguisticsociety.org/resource/guidelines-inclusive-language.
Linguistic Society of America. 2019. LSA statement on race. Online: https://www.linguisticsociety.org/content/lsa-statement-race.
Linguistic Society of America. 2020. The state of linguistics in higher education: Annual report 2019. Online: https://www.linguisticsociety.org/sites/default/files/Annual_Rept_Final_2019.pdf.
Liu, Bing. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies 5.1–167. doi:10.2200/S00416ED1V01Y201204HLT016.
Macaulay, Monica, and Colleen Brice. 1994. Gentlemen prefer blondes: A study of gender bias in example sentences. Cultural performances: Proceedings of the Third Berkeley Women and Language Conference, 449–61.
Macaulay, Monica, and Colleen Brice. 1997. Don't touch my projectile: Gender bias and stereotyping in syntactic examples. Language 73.798–825. doi:10.2307/417327.
McConnell-Ginet, Sally. 2014. Meaning-making and ideologies of gender and sexuality. The handbook of language, gender, and sexuality, 2nd edn., ed. by Susan Ehrlich, Miriam Meyerhoff, and Janet Holmes, 316–34. Chichester: John Wiley & Sons. doi:10.1002/9781118584248.ch16.
Meyer, Meredith; Andrei Cimpian; and Sarah-Jane Leslie. 2015. Women are underrepresented in fields where success is believed to require brilliance. Frontiers in Psychology 6:235. doi:10.3389/fpsyg.2015.00235.
Mohammad, Saif M., and Peter D. Turney. 2013. Crowdsourcing a word-emotion association lexicon. Computational Intelligence 29.436–65. doi:10.1111/j.1467-8640.2012.00460.x.
Muller, Hanna; Phoebe Gaston; Bethany Dickerson; Adam Liter; Karthik Durvasula; Mina Hirzel; Kasia Hitczenko; Margaret Kandel; Paulina Lyskawa; Jacqueline Nelligan; et al. 2019. Gender bias in representation and publishing rates across subfields. Paper presented at the 93rd annual meeting of the Linguistic Society of America, New York City.
Namboodiripad, Savithry; Corrine Occhino; and Lynn Hou. 2019. A survey of linguists and language researchers: Harassment, bias, and what we can do about it. Plenary panel at the 93rd annual meeting of the Linguistic Society of America, New York City. Online: https://sites.google.com/umich.edu/lingclimatesurvey/home.
Pabst, Katharina; Paola Cépeda; Hadas Kotek; Kristen Syrett; Katharine Donelson; and Miranda McCarvel. 2018. Gender bias in linguistics textbooks: Has anything changed since Macaulay & Brice (1997)? Paper presented at the 92nd annual meeting of the Linguistic Society of America, Salt Lake City.
Pyykkönen, Pirita; Jukka Hyönä; and Roger P. G. van Gompel. 2010. Activating gender stereotypes during online spoken language processing: Evidence from visual world eye tracking. Experimental Psychology 57.126–33. doi:10.1027/1618-3169/a000016.
R Core Team. 2013. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. Online: http://www.R-project.org/.
Richy, Célia, and Heather Burnett. 2019. Jean does the dishes while Marie fixes the car: A qualitative and quantitative study of social gender in French syntax articles. Journal of French Language Studies 30.47–72. doi:10.1017/S0959269519000280.
Sanders, Nathan; Pocholo Umbal; and Lex Konnelly. 2020. Methods for increasing equity, diversity, and inclusion in linguistics pedagogy. Proceedings of the 2020 annual conference of the Canadian Linguistic Association. Online: https://cla-acl.artsci.utoronto.ca/wp-content/uploads/actes-2020/Sanders_Umbal_Konnelly_CLA-ACL2020.pdf.
Silge, Julia, and David Robinson. 2016. tidytext: Text mining and analysis using tidy data principles in R. Journal of Open Source Software 1(3):37. doi:10.21105/joss.00037.
Tarrayo, Veronico Nogales. 2014. Gendered word (or world): Sexism in Philippine preschool English language textbooks. Journal on English Language Teaching 4.25–32. doi:10.26634/jelt.4.2.2795.
Valian, Virginia. 1998. Why so slow? Cambridge, MA: MIT Press.
Valian, Virginia. 2005. Beyond gender schemas: Improving the advancement of women in academia. Hypatia 20.198–213. doi:10.1111/j.1527-2001.2005.tb00495.x.
Van Fleet, David D., and Leanne Atwater. 1997. Gender neutral names: Don't be so sure! Sex Roles 37.111–23. doi:10.1023/A:1025696905342.
Zimman, Lal. 2019. Listening to trans+ voices: Trans-inclusive theory and practice for research on sex, gender, and the voice. Paper presented at the 93rd annual meeting of the Linguistic Society of America, New York City.

Footnotes

* We would like to thank audiences at Yale University, Brandeis University, MIT, the University of Oregon, and the LSA 2020 annual meeting in New Orleans for questions and comments. We thank the Yale Women Faculty Forum and Claire Bowern for providing funding that supported this work. Thanks to Monica Macaulay and Katharina Pabst, as well as three anonymous referees and editors Andries Coetzee and Megan Crowhurst for their careful comments on the manuscript. The first author would like to additionally acknowledge the members of the LSA Committee on the Status of Women in Linguistics, and in particular Katharina Pabst, Paola Cépeda, and Kristen Syrett, for their support of this and related work. Finally, special thanks to the broader community that assisted us in carrying this project through: the linguists and native speakers who assisted with identification of the gender of proper names: Will Bennett, Robert Daland, Michael Yoshitaka Erlewine, Henrison Hsieh, Beste Kamali, Sonia Kasyanenko, Monica Macaulay, Maria Polinsky, Norvin Richards, Eszter Rónai, Patricia Schneider-Zioga, Chaeyun Sheen, Jisu Sheen, Sverre Stausland, and Leah Velleman; and the Yale undergraduates who coded the data used in this study: Joshua Celli, Joe Class, Karina Di Franco, Zhiliang Fang, Stella Fitzgerald, Abigail Fortier, Michael Gancz, Calvin Kaleel, Nico Kidd, Amelia Lake, Shayley Martin, Georgia Michelman, Prastik Mohanraj, Serena Puang, Ronnie Rodriguez, Faren Roth, Oliver Shoulson, Slater Smith, Aarohi Srivastava, Lena Venkatraman, Nanyan Wu, Stella Xu, and Justin Yamamura.

1. Revised and expanded from the LSA's 1997 'Guidelines for nonsexist usage'.

2. In more detail, grammatical gender comprises formal morphosyntactic features, namely the properties of words that allow the formal grammatical process of agreement to be carried out. This includes agreement of grammatical gender categories such as masculine, feminine, neuter, common, and so forth. These features are properties of the morphemes themselves and may be independent from the real-world biosocial genders associated with the referents. See Ackerman 2019 for details.

3. This is a word-play on the homonym pair, a slang term for a woman's breasts.

4. We chose to exclude Macaulay & Brice 1997, published in Language, to avoid skewing the results.

5. This means that we did not explore any examples that appeared in the text of an article, outside of numbered examples. Furthermore, we recognize that our automated means of extracting examples may have caused some to be missed. If our method missed data, we have no reason to believe it would skew the data set in any meaningful way. Therefore, we believe that our results are representative and valid.

6. That is, we did not consider inanimate nouns with grammatical gender in languages such as Hebrew or German, such as ha-gesher 'the-bridge.m' (Hebrew) or die Brücke 'the.f bridge' (German), despite the fact that there exists research showing that the way speakers refer to such nominals does differ in stereotypical ways that follow from the grammatical gender assigned to them (Boroditsky et al. 2003). Here our focus is solely on arguments that refer to humans or anthropomorphized individuals. As anthropomorphized individuals were rare in the study, we refer to arguments as human in the remainder of the text.

7. We realize that this may result in some inaccuracies in the coding in cases where the structure of the language in question varied from that of English and the coders were not aware of the difference. For example, a referee provides the case of French tu me manques versus English I miss you, where the participants are in different syntactic positions. We believe such cases to be fairly rare. Further, the fact that our findings do not differ based on the language of the example (see §3.4) leads us to believe that this choice did not greatly affect our results.

8. These include arguments such as the students, who, and everyone and names such as Taylor, which can traditionally be assigned to both male-identifying and female-identifying individuals, when no corresponding pronoun or context can disambiguate the intended referent. We return to the case of names below.

9. We thank Susan Fischer (p.c.) for this suggestion.

10. Our spreadsheet included all names not familiar to us, for review by the community. This was especially important in order to minimize the number of non-Western names that were excluded from the study as unknown. See the acknowledgments footnote for the names of those who helped in this process.

11. Our data set additionally contains example sentences featuring thirty-five male-gendered, fifteen female-gendered, and seven nongendered 'geniuses'. Given these small numbers, we offer that data without further comment.

12. Interestingly, this is the opposite finding of the one in Pabst et al. 2018, where women are overrepresented in negative emotion cases. We do not offer a potential explanation here.

13. We once again acknowledge here that the binary classification is imperfect in the ways described above, especially when discussing working linguists rather than fictional individuals in example sentences.

14. The models reported in the text include all article authors (first and nonfirst authors). Models were also run considering the gender of first authors only. The results from those models show the same statistical trends as for all authors, and therefore we do not report on them separately here.

15. Pabst et al. (2018) report that female authors in their study are more likely than male authors to use nongendered arguments.

16. See in particular 'Job type by gender', pp. 15–17.

18. https://docs.google.com/spreadsheets/d/1GF6c5qFFzTqYGukRYia8WcSam48tBHm_R6MJB5tJPiI/edit#gid=0. The document is described as follows: 'These are names (and pronouns, where possible) of people who responded to a tweet asking for nonbinary volunteers who'd be okay with their names being used in linguistics examples … The purpose of the list is to provide linguists with names to use in example sentences, which historically have suffered from significant gender bias.'
