Six intermediate/advanced learners of English, studying in the United Kingdom, identified messages that they wanted to convey in specific future conversations and memorized native-like versions of them. Their ability to use them in practice and in the real conversation was analyzed. Propensity to attempt using memorized material correlated with aural-repetition aptitude, but neither propensity nor accuracy of reproduction correlated with proficiency. It is hypothesized that increased proficiency invites increased risk-taking during memorization. Individual differences in motivation and attitude, and the relationship with the interlocutor, are all discussed as salient variables.
Résumé
Six intermediate/advanced students learning English in the United Kingdom defined the messages that they wanted to convey in specific upcoming conversations, and memorized versions of them that matched the speech of a native English speaker. Their ability to use these messages in practice exercises and in real conversations was analyzed. The students' propensity to try using the memorized material appears to be linked to their aptitude for oral repetition, but neither this propensity nor the accuracy of reproduction is linked to their proficiency. On the basis of these results, it is hypothesized that more risks must be taken during memorization in order to increase one's proficiency. The role of important variables such as individual differences in motivation and attitude, and the relationship with the interlocutor, is examined.
Introduction
Multi-word memorization can have valuable communicative benefits for language learners (Hakuta, 1976; Jeremias, 1982), though opinion varies as to whether it does (e.g., Ellis & Sinclair, 1996) or does not (e.g., Granger, 1998) contribute to wider learning. In China, memorization is a popular approach to study (e.g., Au & Entwistle, 1999; Cooper, 2004; Dahlin & Watkins, 2000; Ding, 2005; Kennedy, 2002; Zhanrong, 2002) and can be highly effective, provided that it consolidates and/or facilitates understanding (Cooper, p. 294; Dahlin & Watkins, p. 67; Marton, Dall'Alba, & Tse, 1993). In this paper we explore what happened when learners memorized word strings and then tried to use them in real conversations.
Learners need real interaction if they are to practise using the language, but if they draw on their imperfect knowledge they are in danger of reinforcing bad habits rather than learning good ones. Ideally they need to know already how to express an idea in a native-like way, so that real interaction reinforces native-like behaviour. But how can such pre-knowledge be obtained? Extensive observation is one possibility, but only for learners in certain kinds of situation. Otherwise, the forms must be identified in a secondary context: off line. In the classroom, however, there are practical constraints on addressing the specific needs of one individual on a given occasion (Gatbonton & Segalowitz, 1988), and learners may struggle to adapt generic material to a specific situation. This study explored what happened when learners memorized a native-like way of saying exactly what they believed they would need.
The notion seems artificial and extreme, yet there is evidence that adult humans can navigate an interactional situation effectively using only prefabricated material. Some conversational aids for non-speaking disabled people (e.g., with cerebral palsy) store specific material on computer, ready for selection and speech synthesis when needed. One such system, TALK (e.g., Grant, 1995; Todman, Rankin, & File, 1999a, 1999b; Wray, 2002b), was the inspiration for the present study, and a TALK user, Sylvia, acted as a consultant. We wanted to know whether the techniques that made conversation possible for Sylvia could be transferred to the language learner.
Extending the principle of TALK to the L2 context was an interesting challenge. One of several important differences between a TALK user and a language learner is that in TALK the computer will reproduce with 100% accuracy whatever the user decides to enter into its memory, while the language learner is subject to potential inaccuracies in both memorizing and recall. The flipside is that the TALK user must operate within the confines of faithful reproduction. Although TALK does have a facility for editing stored material, Sylvia did not use it, because it slowed down the pace of the conversation so much. She preferred to take a strong hand in directing the conversation towards the stored material. In contrast, language learners will all too easily resort to editing or indeed entire reformulation. Dörnyei, Durow, & Zahran (2004), Skehan (1989), and others have suggested that the extent of formulaic language use might be affected by an individual's learning background, risk-taking tendencies, proficiency, aptitude, attitude, and motivation, and we consider these factors in relation to the participants' performances in the experiment.
The study
In this study, learners anticipated and memorized specific language strings for use in real-life conversations, and their use of them in practice and real conversations was observed. Of course, rote memorization cannot be viewed as a particularly realistic option for teaching and learning, and our purpose was not to trial a new method for improving idiomaticity in learners. Rather, we were interested in examining certain phenomena that can be effectively explored only in a very controlled learning situation:
- The learning characteristics of individuals who find it easy to memorize and reproduce prefabricated material
- The effect of being in the real conversation (compared with rehearsing)
- The extent to which effects are consistent across participants
- The characteristics of conversations that most support the use of prepared material
Additional considerations, relating to the linguistic forms most likely to be subject to deviations from the target, are reported elsewhere (Wray & Fitzpatrick, submitted).
The design of the study enabled us to control certain variables that usually confound research into second language acquisition (SLA), such as whether or not the learners had a motivation for learning a particular form, whether the learned form was an adequate reflection of something they might actually want to say, and whether they were sufficiently familiar with it to be able to reproduce it accurately. The participants were intermediate to advanced learners living, studying, and in some cases also working in the L2 environment. A benefit of using such participants was that while it has long been recognized that prefabricated language is useful in the very early stages of language learning, less attention has been given to the acquisition and use of prefabricated strings by more advanced learners (Wray, 2000), even though there is an obvious potential link between memorization and the mastering of idiomaticity (that is, the native-like expression of ideas). Therefore, this study is particularly relevant to the specific issues of how memorizing native-like material, whether intentionally or inadvertently, can support idiomaticity in already accomplished learners and, conversely, what factors other than fundamental proficiency might compromise the attainment of idiomaticity in an L2.
Each student worked one-to-one with the researcher (a native speaker of British English) to identify a future conversation that she needed to have with a native speaker, and to predict, formulate, and practise utterances useful and appropriate to that conversation. The study was iterative: Each participant engaged sequentially in a number of self-contained conversation-preparation and -performance packages, each of which took place over seven to ten days. This gave participants the opportunity to choose when and how often they participated in the study, and enabled them to reflect on their learning and to experiment with different techniques. It also ensured that we could alter any design features of the study that were not working well, though, in fact, no such modification was necessary.
Participants
The six learners participating in the main study were all female and were temporarily resident in the United Kingdom as master's students in health science or development studies. We refer to them by identification codes (Table 1). Since the research was of a case study design, with intensive individual work and detailed qualitative analysis, it is particularly beneficial that it was possible to gather data from as many as six participants. The combined data make possible a number of analyses that could not be carried out with fewer participants.
Table 1. Participant information
Methodology
Participants first completed a questionnaire giving details of their language-learning history, their daily use of English, their aspirations regarding English language use, and their beliefs about language learning. This last section was adapted from Horwitz's 'Beliefs about Language Learning Inventory' (1987). The participants also completed two vocabulary tests known to correlate with general language proficiency, and a battery of language-learning aptitude tests (described later).
The study cycle comprised a series of discrete stages for each target conversation. In stage one the participant, in discussion with the researcher, identified a conversation or transaction that she anticipated having with a native speaker of English within the next few days. These conversations were thus a direct reflection of the participant's genuine life needs. The conversations so identified included getting film developed at a local store, inviting a classmate over for dinner, asking a lecturer for an extension on an essay, and asking the advice of a vet about how to get hamsters to mate.
The participant then made a first attempt at what she would expect to say during the targeted encounter, thus providing a measure of her existing capacity to produce an accurate, native-like utterance. On the basis of the desired message that the participant articulated, the researcher offered the participant a native-like paraphrase. The paraphrases were colloquial and fully in keeping with local cultural practices, while accurately representing the individual's specific message preferences. Normally, around 10–12 native-speaker-like sentences/phrases were prepared for each planned conversation. The researcher digitally recorded these 'targets' and transferred them to CD for the participant to take away and memorize. No written version was provided, and participants were advised not to transcribe the material or make notes.
Stage two required participants to practise, in their own time, the target utterances by listening to each one and repeating it aloud as many times as possible. In stage three, the researcher met with the participant again, in order to check her progress in accurate memorization and to address any problems that had arisen. At this meeting a 'practice performance' of the conversation took place, with the researcher taking the part of the native speaker interlocutor. Stage four was the real-life conversation situation that the participant had anticipated. The participant's challenge was to achieve her interactional goals by using, as far as possible, the memorized target utterances.
Stage five took place one or two days after the target conversation. The researcher interviewed the participant, asking for her assessment of the success of the conversation and a report of how easy it was to use the prepared material. Finally, two to three months later, selected conversations were further explored (stage six) by asking the participant, without warning, to recall as much as possible of the memorized material.
All stages were audio-recorded. After completion of their final cycles, the participants completed a written questionnaire, which elicited their comments regarding the ease, usefulness, and perceived success of the learning experience.
Participant data
Learning background, attitude, and motivation
In the initial questionnaire, all participants stressed the importance of watching television, listening to radio, talking with native speakers, and reading books in English. They all felt that learning grammar rules improved their English. They said that they planned what they were going to say before they said it, and if someone didn't understand them, they would try to say it another way. On the whole, participants were more confident about their language use in transactional than in academic contexts.
The Chinese participants' language learning had focused on grammar and pronunciation, with lots of memorizing and listen-repeat activities, whereas the Japanese participants' had focused on grammar, reading, and writing, with very few opportunities to speak, and with listening and speaking skills taught poorly. In the United Kingdom, the Chinese participants were living with other Chinese speakers and communicated mostly in the L1, whereas the Japanese participants had English-speaking housemates.
Responses to a question about what motivated their learning (Table 2) revealed a tendency towards instrumental motivation for all participants except Jo, whose desire to 'talk to native speakers' is more in line with characteristics of integrative motivation.
Jo's answers to questions in the Language Beliefs section of the questionnaire were consistent with her rejection of instrumental motives (she did not associate success at learning English with getting a good job) and with her more integrative tendencies ('I would like to improve my English so I can get to know British people better'). All participants except Sa wanted to speak English like a native speaker, with television reporters and lecturers as their models. Jo, Sa, and Hi felt that memorizing complete phrases improved their English. In terms of language-learning beliefs, Ch differed from the other participants in that she strongly favoured accuracy over fluency, agreeing, for example, with the statement that 'it is important to speak English with excellent pronunciation' and 'if you do not know a word you should not guess it.'
Table 2. Motivation to learn English
Language proficiency
The participants completed two vocabulary tests: the Eurocentres Vocabulary Size Test (EVST) (Meara & Jones, 1988) and Lex30 (Meara & Fitzpatrick, 2000). EVST is a yes/no test measuring receptive vocabulary size by presenting a list of words from different word-frequency bands. The subject indicates whether she knows the word, and the overall vocabulary size is extrapolated (a proportion of non-words is included in order to calculate the extent of guessing). Since there is a reliable correlation between vocabulary size and scores on other tests, such as grammatical knowledge (Meara & Jones), EVST offers a quick and efficient way of estimating general proficiency.
The Lex30 Test of Productive Vocabulary is framed as a word association task, but its actual purpose is to elicit a representative set of productive vocabulary. Participants are presented with 30 stimulus words and are asked to write at least three responses for each one, using free word association. The responses are categorized according to word frequency, and the Lex30 score is the total number of infrequent words, as a percentage of all the words produced. Like EVST, this test has been shown to correlate significantly with other measures of proficiency. For a full account of the Lex30 test, see Fitzpatrick & Meara (2004). The results of the EVST and Lex30 tests are plotted in Figure 1.
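The percentage described above can be illustrated with a minimal sketch. The word list `frequent_words` and the sample responses are hypothetical, and the real test relies on published frequency bands and scoring conventions that are not reproduced here.

```python
def lex30_percentage(responses, frequent_words):
    """Infrequent responses as a percentage of all responses produced.

    `frequent_words` is a hypothetical stand-in for a high-frequency word
    list; the actual Lex30 frequency bands are not given in this paper.
    """
    words = [w.lower() for w in responses]
    if not words:
        raise ValueError("no responses to score")
    infrequent = [w for w in words if w not in frequent_words]
    return 100 * len(infrequent) / len(words)


# Hypothetical responses and frequency list, for illustration only.
frequent = {"house", "window", "the"}
print(lex30_percentage(["house", "window", "obstinate", "the"], frequent))
```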
Figure 1. Results of EVST and Lex30 tests
The highest possible score on EVST is 10,000. A score of 6000+ roughly equates with an IELTS score of 6.5. Figure 1 shows that Hi and Ch were the most proficient in the tests (and, by standard extrapolation, the most generally proficient in English). Lo was the weakest in the EVST (receptive knowledge), and Jo in Lex30 (productive knowledge).
Language learning aptitude
Participants completed the Lognostics Language Learning Aptitude Test (LLAT) (Meara, Milton, & Lorenzo-Dus, 2001). This test comprises a series of five computerized tasks that measure different aspects of language learning aptitude:
- LAT A: Aural memory. Participants listen to and reproduce unfamiliar sound strings.
- LAT B: Visual memory. Participants remember paired associates – words from a fictional language and their translations.
- LAT C: Inference of language rules. Participants make judgments about the grammar rules of a fictional language.
- LAT D: Recognition of unfamiliar words by sound. Participants indicate which words in a string are repeated.
- LAT E: Linking of unfamiliar sounds and symbols. Participants make judgments about the spelling and pronunciation of fictional words, written in a fictional alphabet with associated sounds.
Results of the LLAT tests are given in Table 3. The final row of the table is a normative indicator, showing the median 40% range of performances obtained universally on the test. The maximum possible score for each test is 100.
Because of the cross-categorical nature of LLAT, it is rare to be able to single out any individual as being 'better' than others. Rather, the tests provide a profile of aptitudes to which language-learning proficiency or other aspects of performance might be attributed (see later).
Summative individual profiles
From the results of the tests and questionnaire we were able to compile a summary profile for each participant (Table 4).
Data analysis
In total, the six participants engaged in 21 conversation cycles, containing 227 model utterances – 10.8 models, on average, per conversation. The mean length of a model utterance was 10.05 words. The material produced in recall included one or more attempts at a target (in practice and/or real performance, plus a few instances of delayed recall around three months later). In the detailed analysis of the data (Wray & Fitzpatrick, submitted), targets that were never attempted were excluded, since it was unclear whether they had been memorized at all, and if they had, whether they had ever been deemed relevant for use. As a result, the main data consisted of a total of 2,416 memorized words. Table 5 shows the distribution of target material across the six participants.
Table 3. LAT test results
Table 4. Learner summary profiles
Table 5. Profile of data set
Transcription
The data were transcribed from the audio recordings according to the following rubric:
Line 1: the 'initial idea' (I), the attempt by the participant to express the desired message. The (I) provides a reference point for studying patterns in the subsequent output.
Line 2: the 'model utterance' (MU) produced by the native speaker researcher, also referred to as the 'target.'
Line 3: the 'practice performance' utterance (PP) produced by the participant.
Line 4: the 'real performance' utterance (RP) produced by the participant in the real-life target conversation.
Line 5: for selected conversations, a 'delayed performance' utterance (DP) produced by the participant two to three months after the target conversation.
Samples are given in Figure 2. Lc is discussing a course assignment with her classmate. Jo is ordering photographic prints.
Quantification
Figure 2. Sample set of target utterance realizations from Lc 2:4 and Jo 2:7
The transcriptions revealed a complex mixture of similarity and difference in the rendering of the same message at different points in the cycle. In order to quantify the variation, two calculations were made. First, a participant's 'mean propensity to attempt target utterances' was calculated as target utterances attempted ÷ target utterances prepared (Table 6). We prefer 'propensity to attempt' rather than 'willingness to attempt' because in some cases a participant who was willing, even eager, to attempt a target utterance, was unable to take advantage of, or to create, an opportunity to produce it. An 'attempted target utterance' was one that had been produced partially or completely, with or without native or non-native-like changes.
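As a minimal sketch of this first calculation, the ratio can be computed as follows. The function name is illustrative, and the counts are those reported later in this paper for one of Sa's conversations (11 utterances prepared, 11 attempted at PP, 4 attempted at RP).

```python
def propensity_to_attempt(attempted: int, prepared: int) -> float:
    """Target utterances attempted divided by target utterances prepared."""
    if prepared <= 0:
        raise ValueError("at least one target utterance must be prepared")
    if not 0 <= attempted <= prepared:
        raise ValueError("attempted must lie between 0 and prepared")
    return attempted / prepared


# Sa's essay conversation: 11 prepared, 11 attempted at PP, 4 at RP.
pp = propensity_to_attempt(11, 11)
rp = propensity_to_attempt(4, 11)
print(f"PP: {pp:.2f}  RP: {rp:.2f}")  # PP: 1.00  RP: 0.36
```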
Table 6. Mean propensity to attempt the model utterances
The second calculation was a measure of an utterance's 'accuracy/completeness' (i.e., closeness of reproduction to the model): number of words produced with same form and function as in model target utterance ÷ number of words in model target utterance. The stipulation that a word should have the 'same form and function' was in order to avoid counting words that happened to be identical in form to a target word but were not an instance of it, such as 'to' as infinitive marker and 'to' as a preposition. As an illustration, the 'closeness' scores for Lc 2:4 from Figure 2 are shown in Figure 3.
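The following is a rough sketch of the closeness ratio, under a stated simplification: it matches words by form only, so it cannot enforce the 'same function' stipulation, which requires a human judgment. The tokenization rule is likewise an assumption. The example pair is Lo 2:1, quoted later in this paper; the resulting figure is the output of this sketch, not a score reported by the study.

```python
import re
from collections import Counter


def tokens(utterance: str) -> list:
    """Lower-case word tokens; apostrophes are kept so "name's" is one token."""
    return re.findall(r"[a-z']+", utterance.lower())


def closeness_by_form(model: str, produced: str) -> float:
    """Share of model words that reappear in the produced utterance.

    Form-only approximation: a produced 'to' would match a model 'to' even
    if one is an infinitive marker and the other a preposition.
    """
    model_counts = Counter(tokens(model))
    produced_counts = Counter(tokens(produced))
    matched = sum((model_counts & produced_counts).values())
    return matched / sum(model_counts.values())


mu = "Hello, my name's Louise, I'm a student here."
rp = "I think I have to introduce myself ... my name's Louise, I'm student here."
print(closeness_by_form(mu, rp))  # 0.75 (6 of the 8 model words recur)
```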
Results
Use of memorized material and accuracy of recall
The six participants varied in both their propensity to attempt utterances (Table 6) and their accuracy of recall. There was a significant correlation between cycles in the real performance (rs = 0.943, p < 0.01), indicating that an individual who attempted a lot of target utterances in the first conversation was likely to do so in the second. That is, it seems to have been the individuals' approach to the task, not, say, the topic or context, that determined the likelihood of attempting recall. On the other hand, no significant correlation was found between the proportion of target utterances attempted in real performance and either the EVST scores (r = -0.177, p = 0.738) or the Lex30 scores (r = -0.666, p = 0.149). It seems, then, that propensity to attempt a target utterance was not linked to proficiency, at least as measured by EVST/Lex30.
The results of the aptitude battery were compared with the proportion of target utterances each participant attempted, using a rank correlation analysis. There was a significant correlation (rs = 0.886, p = 0.019) between the proportion of target utterances attempted at RP by each participant and her LAT A test score, but no correlation with the scores in LAT B, C, D, or E. LAT A measures aural memory by asking participants to listen to and reproduce sound strings. Participants with high scores on this test were more likely to attempt target utterances.
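For readers unfamiliar with rank correlation, here is a minimal sketch of Spearman's coefficient, computed as the Pearson correlation of average ranks. The six data pairs are hypothetical and are not the participants' scores. They were chosen so that two adjacent pairs of ranks are swapped, which for six untied cases gives 1 − 6(4)/210 ≈ 0.886; this shows the arithmetic only and says nothing about the study's actual rankings.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical proportions attempted at RP and hypothetical LAT A scores.
proportion_attempted = [0.40, 0.55, 0.60, 0.75, 0.80, 0.95]
lat_a_score = [30, 20, 50, 40, 60, 70]
print(round(spearman_rho(proportion_attempted, lat_a_score), 3))  # 0.886
```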
Figure 3. 'Closeness' score calculations for sample data
In accuracy, there was no significant correlation between closeness of reproduction and either the EVST score (r = -0.392, p = 0.442) or the Lex30 score (r = -0.407, p = 0.423), nor was closeness of reproduction correlated with any of the aptitude scores including LAT A. Although this may seem counterintuitive, in fact there is a potential explanation for it – attitude to risk – which we discuss later.
Relationship between practice and real performance output
When language learners find themselves participating in unanticipated real conversations, with genuine transactions to achieve, a complex situation is created. Without specific preparation, they are obliged to draw on whatever general linguistic resources are at their disposal: words and rules, and perhaps some formulaic material learned for similar but not identical contexts. They also have to navigate the transaction process itself, if they are not entirely familiar with the cultural norms of the exchange. An analysis of what is said in such circumstances cannot provide an accurate account of the learner's knowledge of the language: Has she raised her game to its highest level under the rush of adrenalin, or made unrepresentative production errors because of her focus on content rather than form? Has she constructed accurate output on line, displaying high proficiency, or retrieved a memorized, but not fully understood, formula, disguising low proficiency behind native-like output?
In this study, the unexpectedness dimension was reduced to a minimum, and there was a baseline of relevant linguistic material that we knew the participant had. This makes a comparison between the practice and real performances of some interest. Given that the participants knew only one native-like way (the target) of expressing a given message (the transcripts of the I-lines confirm that they were generally a long way from knowing any others – see Figure 2, for instance), we hypothesized that, once the appropriate opportunity arose, the path of least resistance should be an attempt to reproduce the memorized string. This hypothesis should apply equally in the practice and real conditions. Any inaccuracies in the real performance that had not occurred in the rehearsal could therefore be viewed as a function of the pressures of real interaction, but excluding at least some of the unpredictability usually inherent there. The differences would therefore demonstrate, more directly than is normally possible, what happens to linguistic output under that condition.
For every conversation in the study, the proportion of target utterances attempted at RP was less than or equal to the proportion attempted at PP (Table 7). A paired samples t-test revealed a highly significant difference (t = 5.455, p < 0.0001). This result shows that the 'real' situation did make it more difficult to use the memorized strings, though it cannot indicate whether this was because there was no opportunity to use them, or because when the opportunity arose, the material did not come to mind.
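The paired-samples comparison can be sketched as follows: the t statistic is the mean of the per-conversation differences divided by its standard error. The proportions below are hypothetical placeholders, since the values in Table 7 are not reproduced here, and the paper does not state the degrees of freedom (they would be 20 if all 21 conversations entered the test).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired-samples t statistic for two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean(diffs) / (sd / sqrt(len(diffs)))


# Hypothetical proportions attempted per conversation at PP and at RP.
pp = [1.0, 0.9, 0.8, 1.0, 0.7]
rp = [0.4, 0.6, 0.8, 0.7, 0.5]
t = paired_t(pp, rp)
print(f"t = {t:.3f}, df = {len(pp) - 1}")
```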
Table 7. Proportion of target utterances attempted at PP and RP
In order to establish which of these explanations was more likely, two conversations were examined: those in which the difference between the proportion of target utterances attempted at PP and at RP was largest. They clearly indicate that there was unfulfilled opportunity. Sa, in a conversation with a classmate about how her essay preparation was going, used all 11 target utterances at PP, but only 4 at RP. Yet the conversation did in fact provide her with opportunities to use all 7 of the unused target utterances. For example, the interlocutor's 'Did you write 4000 words on it?' failed to prompt 'I could have written lots more about the Japanese experience, but I had to keep to the word limit.' The TALK research reveals many instances where Sylvia would make a match like this – it may not be entirely optimal, but it would be a reasonable fit. Lo, in one of her conversations, used 10 of her 11 target utterances at PP, but only 4 at RP. The conversation, however, provided opportunities to use 4 of the 7 unused target utterances. For example, the interlocutor's 'Where would you like to go?' failed to prompt 'I'm not really sure yet, but I think I'd like to go to Paris.' These observations suggest that while reduced opportunity to use memorized strings certainly could be a factor, it was not the sole reason why fewer targets were attempted in the real performance condition than in the practice.
Table 8. Closeness of utterances attempted to their target
Using the 'closeness' measure described earlier, we compared the accuracy (in relation to the target) of the recall in the practice and real conversations (Table 8). The utterances attempted at PP were significantly closer to the target than those attempted at RP (t = 3.574, p < 0.01). This result shows that even when a participant had the capacity and opportunity to use a well-formed word string, and attempted to produce it, changes were often introduced. This is exactly what Wray (2004) found with a learner of Welsh. In that case, because the learner knew so little other Welsh, it was possible to account for the changes as errors rather than deliberately chosen alternatives. In this study, however, the participants did have sufficient English to introduce changes deliberately, including stylistic ones and others that were necessary to modify the content detail. Later, we discuss the relationship between deviation and proficiency, as a function of the perceived risk inherent in not attending to detail.
Individual differences
In the TALK studies, Sylvia proved herself to be highly adept at controlling conversations, so that they did not stray far from the material that she had prepared. The ability and preparedness to take control of conversations is presumably determined jointly by personality and circumstances: Sylvia had very little choice but to draw on what she had prepared, if she wanted the conversation to remain fluent. Over time, she learned that certain features, such as adjuncts, reduced the usability of a word string and were better left out (personal communication). But she also had the strength of character to take on the role of dominant speaker. With personality a potential factor, we looked for evidence of inter-participant variation in how the memorized material was applied in practice. Without doubt, some participants were more adept at manipulation than others. In Lo's conversation 2, she created a chance to use her target introduction model (Lo 2:1):
MU 'Hello, my name's Louise, I'm a student here.'
RP 'I think I have to introduce myself ... my name's Louise, I'm student here.'
Sa, having prepared for a conversation with her flatmate about a weekend trip to Edinburgh, found that the flatmate had even more recently been to Paris. However, she was able to manipulate the conversation back to Edinburgh, with 'So, this weekend you went to Paris and last weekend you went to Edinburgh, so how did you find it, did you enjoy it?' (Sa 3:1). She also improvised with 'So, if you can compare Paris and Edinburgh, which one do you like?' commenting afterwards, 'If we naturally continue our conversation, we talk about Paris,' showing that she had very deliberately brought the conversation back to a point where she could use her prepared target utterances.
Details from the post-study questionnaires offer further evidence that some participants were very conscious of employing a strategy that would enable them to maximize use of the memorized material. Hi observed, 'I just tried not to miss the chance when I could use the phrases.' Sa stated, 'In order to use the memorized sentences, I sometimes repeated the [utterance]. When I realized that I could have used the specific sentence after finishing the [utterance], I went back to the point and repeated ... with the sentence I originally intended to use.' Lc also reported that she sometimes manipulated the conversation so that she could use the target utterance.
Yet, as we saw above, propensity to attempt utterances did not mean that they were accurately produced. The post-study comments shed some light on this matter too. Some participants felt compelled not to produce utterances too close to the target. Lo, for example, claimed to make gratuitous changes to the memorized material: 'I just changed some different words, but it is the same meaning.' While Lc reported that the use of memorized native-like sentences helped her think in a more British way, with the result that she was able to communicate more effectively with British people, Da, a participant used in our pilot study, seemed sensitive to how the wordings challenged his cultural/national identity, observing: 'Sometimes I change [the phrases] maybe I think there is a difference between British thinking and Chinese thinking ... We have to do something in my thinking ... actually we ... haven't really changed Chinese thinking to English thinking so sometimes I have to change some words just for me to easy to ... find a good way to express my emotions.' Comments such as these warn us against assumptions that a failure to reproduce an utterance close to the target is simply due to lack of competence or problems with retrieval.
Characteristics of conversations and targets
No attempt was made in the study to control for the type of conversation that was prepared, because it was important that the participant had free choice. However, it was possible after the event to categorize the conversations into three unequal groups:
- informal (interlocutor is a friend)
- formal (interlocutor is a colleague/tutor/boss)
- unknown (interlocutor and participant have never met before)
Table 9 shows the number of target utterances attempted in the RP conversation for these three conversation types. In the informal and unknown interlocutor conditions, the majority of the prepared utterances were attempted, but in the formal condition, only a little over one quarter were. The distribution is highly significant: χ² = 22.26, df = 2, p < 0.01. As regards closeness to the target, however, while the formal conversations were least close, the differences were not significant (Table 10).
Table 9. Target utterances attempted at RP for each conversation type

Table 10. 'Closeness' scores for each conversation type
Two things may have been interacting here. We shall propose in the next section that closeness of recall could relate to the level of risk that the individual was prepared to take in paying attention to detail during memorization. If so, we should expect to find that the level of deviation was constant between practice and real performance, and this was the case: There was a correlation of .749 (p < 0.001). Nevertheless, of the 40 target utterances that were produced with total accuracy at the PP stage and also attempted at RP, only 19 were produced with total accuracy in the latter. Once more, this raises the question of why, when a participant had demonstrated her ability to produce a target utterance accurately, she did not necessarily go on to do the same at the real performance stage.
Discussion and conclusions
Although it has long been recognized that the dynamics of real conversations can make it difficult for a learner to display her full ability, it is normally difficult to ascertain whether the problem lies with accessing what is known, or with other variables, such as managing the content and direction of the conversation, composing turns from scratch, and so on. The findings from this study indicate that even when a learner has conscious knowledge of having available in memory the native-like way to express the idea that she needs, the ability to reproduce that form is not necessarily sufficient to guarantee that it will be produced at all, let alone accurately, in a real conversation.
Tarone (1988) observes that when the focus is on form, the language produced is more target-like and monitored, whereas monitoring takes second place to fluent (rather than accurate) communication of meaning when the focus is on function. Until the real conversation performance, all our learners' encounters with the target utterances (negotiation of target language strings, repetition and memorization of models, practising models with the researcher) had been form-based. At the real performance stage of the cycle, focus switched to function: The target language was no longer an end in itself but rather became a tool for the achievement of a communicative goal. This point was evident from participants' immediate responses to the real conversation performances. For example, Lc was visibly elated after successfully completing her telephone conversation inviting a classmate to dinner. The same sense of achievement was expressed in the description of 'successful' conversations given by participants in the post-study questionnaire: 'I think we could communicate, I mean I could understand at least what he tried to explain, and I believe what I would like to say, he understood' (Hi). The post-study questionnaires revealed that participants perceived the most successful conversations as being those taking place in a relaxed setting, for example, 'I think the conversation "talking with my flatmate" worked most successfully. As it was the quite informal talk with my friend in a relaxed mood, I could remember the sentence I memorized in the lesson comparatively well and could use them in the conversation' (Sa).
We found differences in the learners' propensity to attempt their target utterances. However, no correlation was found with either proficiency or aptitude (with the exception of aural memory skill), so we need to turn to the attitudinal and motivational information supplied in the learners' questionnaires. Jo attempted all utterances in her first conversation, 10 out of 11 in her second conversation, and 15 out of 17 in her third. Her proficiency indicators were poor: She achieved the lowest score on the productive vocabulary test and the second lowest on the receptive test. Her aptitude test scores were also lower than her peers', and she was the only learner whose motivation can be described as integrative, rather than instrumental. Where her peers directed their language improvement towards academic or career success, she selected the statement 'I would like to improve my English so I can get to know British people better.'
Lambert (1974) describes integrative motivation as springing from 'a sincere and personal interest in the people and culture represented by the other language group' (p. 98). We can contrast Jo with Ch, who consistently failed to attempt the utterances she had prepared. Ch's profile was very different from Jo's. She scored very highly in the proficiency and aptitude tests, and her motivation was instrumental: She wanted to speak enough English to debate issues in class, and she strongly agreed that learning English would help her get a better job. Whereas Jo disagreed with the statement 'I feel shy speaking English with other people,' Ch agreed with it, demonstrating a sense of removal from the L2 user group.
We should not, of course, draw strong conclusions on the basis of such a small sample. However, our study does at least suggest that certain types of learner – perhaps those who are not responding particularly well to other modes of teaching – might find benefit from preparing native-like utterances as a means of accessing opportunities to operate effectively in the L2.
Yet there remains an underlying mystery: Why should memorizing a word string and knowing that you can reproduce it accurately not be enough to ensure that it is reproduced accurately? In particular, why is there no clear relationship between accuracy and proficiency?
In Wray & Fitzpatrick (submitted) we make the observation that as a learner becomes more proficient, she is more and more able to take risks with how much attention she pays to the precise detail in target material. Memorization effort can be offset against the ability to reconstruct the original from more abstract semantic markers. For instance, a highly proficient learner need not specifically remember the form of the plural marker on a noun. It is enough to remember that the reference is to a plural entity, because the form can then be reliably reinstated. In this regard, the learner is emulating the native speaker, who is in a strong position to take many such risks with memorization, risks that may introduce deviations from the target (synonym substitutions, alternative grammatical formulations, etc.). For the learner, a sign of real proficiency will be when similar kinds of deviations are made: The risks are taken, and where recall is incomplete, the deviations are native-like. However, the level of risk entailed may often be higher, because the learner may somewhat overestimate her capacity to reliably produce forms (productive knowledge) that she confidently recognizes (receptive knowledge), becoming beguiled by the familiarity of the target form into believing that it is under full productive control. The outcome could be deviations that are not native-like.
It follows that the greater taking of risks, although ultimately a sign of high proficiency, may actually increase the individual's vulnerability to error. Wray & Fitzpatrick (submitted) propose that deviations from memorized material can be used to measure proficiency, by calculating the level of non-native-like deviation from the target as a proportion of the total (native-like and non-native-like) deviations, representing the level of risk taken.
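One way to write down the proportion described above is sketched here. The symbols are our own illustrative notation, not taken from Wray & Fitzpatrick, and the counts in the worked line are invented purely to show the arithmetic.

```latex
% Hypothetical notation (an assumption, not the authors' own):
%   D_{nl}  = number of native-like deviations from the target
%   D_{nnl} = number of non-native-like deviations from the target
% The denominator, total deviations, stands for the level of risk taken.
\[
  P_{nnl} = \frac{D_{nnl}}{D_{nl} + D_{nnl}}
\]
% Invented illustration: 9 native-like and 3 non-native-like deviations.
\[
  P_{nnl} = \frac{3}{9 + 3} = 0.25
\]
```

On this reading, a lower value would suggest that more of the learner's risk-taking resulted in native-like output. How the measure should be interpreted in practice is a matter for the paper cited.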
The patterns observed in this study also provide an opportunity to evaluate the predictions arising from Wray's (2002a) model of how languages are learned in adulthood. She proposes, specifically, that the adult's approach to language learning is constrained by both biological and cultural factors that combine to make it very difficult indeed to resist breaking down native-like material into smaller units. The effect of doing so is to make it hard to remember just how the units were originally combined, and in the process of making good the shortfall in this knowledge, the learner's inter-language rules must be employed, compromising the accuracy of the reproduction. Thus Wray predicts that errors will be introduced at the boundaries of the linguistic units perceived by the learner – a point that she was able to demonstrate effectively in her 2004 study of the Welsh beginner, whose input she had tracked from the start. Wray's prognosis, in fact, is not hopeful for most learners, who are predicted to remain victims of over-analysis – that is, unwrapping the packaging in the interests of more effective learning but, in the process, losing vital information about how to put the constituent units back together.
Fundamentally, however, Wray's model also returns to the agenda of the individual, by predicting that the use and accuracy of memorized material will be contingent on what the learner's needs are. This study reveals that individual differences throw up many agendas that can conflict with the desire to sound native-like: achieving a successful interaction (Sylvia's priority in TALK), remaining flexible, and being true to one's own world-perception. Such deep-rooted motivations remind us that idiomaticity is not like a coat of paint that can be applied to all learners uniformly for a uniform effect. Idiomaticity is not the top layer, but rather is intrinsically tied into the bottom layers – the learner's personal identity and motivations in learning and using the language.



