The use of collocations across proficiency levels: a literature review

This literature review focuses on the use of formulaic language by students of English as a second language (L2). Research in the field of phraseology has shown that mastery of formulas is central to fluency and linguistic competence (Ellis, 1996). Studies on the use of formulaic language by native speakers (Ellis et al., 2008) have shown that native speakers process these structures as a single word. Research on L2 students, in turn, has shown that formulaic language can be problematic for learners, as they often do not know the correct word associations (Men, 2018). This paper reviews studies of formulaic language, more specifically of collocations, used by L2 learners. The first part of the paper deals with the different definitions of collocations, while the second part focuses on studies of collocation use by L2 learners.


INTRODUCTION
Several studies (Erman & Warren, 2000; Biber & Conrad, 1999; Pawley & Syder, 1983) have suggested that language is mainly composed of fixed or semi-fixed sequences. Aside from the pervasiveness of formulas in language, research on formulaic language processing has shown that native speakers of English (L1) process these formulas as a single element. Ellis (1996, p. 111), for instance, argues that formulaic language is perceived as a "big word - the role of working memory in learning such structures is the same as for words". Sinclair (1991, p. 110) agrees with this view, stating that formulas are a "single choice, even though they might appear to be analyzable into segments". Perkins (1999, p. 56) explains the use of formulaic language by saying that "the main reason for the prevalence of formulaicity in the adult language system appears to be the simple processing principle of economy of effort", while Wray (2005) argues that even though humans are able to process language grammatically, or rather analytically, the preferred way of coping with language input and output is through chunks of language. In sum, previous research has established that native speakers process formulaic language as a single word. These investigations, however, do not account for how speakers of English as a second language (L2) use and process formulaic language. Sinclair (1991) proposes that language users deal with formulaic language based on two principles, the open choice principle and the idiom principle. The first allows for new and creative uses of language, while the second refers to the use of frequent combinations of words. These two principles are especially important when considering speakers of English as an L2, as it is unclear whether they rely on the idiom principle or on the open choice principle when using their second language.
Ellis (1996) argues that L2 learners' acquisition of formulaic sequences differs from that of native speakers: native speakers process formulas by relying on semantic associations, while L2 learners rely on orthography and phonology, which may lead them to make wrong associations based on orthographic or phonological confusion. In a later study, Ellis et al. (2008) confirmed that native speakers process formulas based on different criteria than L2 learners do. While the latter used formulas that were more frequent, the former used formulas that had a stronger association between words.
In research on reading and writing, it is established that the use of formulaic language makes a text more fluent (Ellis, 1996). Nevertheless, in L2 writing, different studies (Boers & Webb, 2018; Wray, 2013; Paquot & Granger, 2012; Nesselhauf, 2005) have shown that the use of formulaic language can be an issue for beginners as well as for advanced learners, with proficiency level affecting both the number and the types of formulas used. Another issue with the use of formulaic language in written texts has been raised by Yoon (2016), who argues that each register is characterized by the use of distinct formulas. Therefore, the aim of this literature review is to describe how language development influences the use of formulaic language by L2 students.
In order to address this goal, this paper is divided into five sections. Section two describes the approaches found in the literature for the study of formulaic language and discusses the different definitions used in phraseology studies. Section three presents the methodology used in this literature review. In section four, the results of the literature review are presented in light of the research questions. Section five discusses the implications and limitations of this study.

DEFINING FORMULAIC LANGUAGE
In this section, the different approaches used in research on formulaic language are described, along with the different terms used to refer to frequent strings of words.

APPROACHES TO THE STUDY OF FORMULAIC LANGUAGE
In the field of formulaic language, different terms have been used to define the same object of study, while sometimes the same term is adopted to define different objects of study. One of the reasons for this is the variety of approaches that have been used to study formulaic language. Wray (2005), Durrant and Mathews-Aydınlı (2011), and Durrant (2014) describe three research approaches. The first is the phraseological approach (e.g., Cowie, 1998; Erman and Warren, 2000), which analyzes the meaning of a word combination. This approach is concerned with the degree to which the meaning of a word combination is predictable from the meaning of its parts. It may also analyze whether words with similar meanings can be substituted in a phrase (e.g., jump through hoops vs. *skip through hoops). This approach usually relies on researchers' intuition of what is formulaic in a given language. Furthermore, Wray (2005) argues that this approach yields idioms rather than formulas. The second is the psychological approach (e.g., Wray, 2005; Ellis et al., 2008), which focuses on the mental processing and storage of language. This approach defines formulas as items that speakers store and process as a whole. The third is the frequency approach (e.g., Biber and Conrad, 2009; Hoey, 2005), which focuses on the frequency of co-occurrence of certain linguistic combinations in a text. These combinations can involve words, parts of speech, or semantic fields. The frequency approach is associated with corpus linguistics studies of formulaic language. Unlike the psychological approach, the frequency approach takes as its object of study the texts (written or spoken) produced by language users. One of the issues with this approach is that researchers have set the boundaries of the word strings under study differently, producing different results.
Although these three approaches suggest that different phenomena are being studied, Durrant and Mathews-Aydınlı (2011) highlight the fact that the psychological and the frequency approaches look at the same phenomenon from different perspectives. Wray (2005) also argues that the usage frequency of formulaic sequences is associated with how they are stored and processed in the brain, thus corroborating Durrant and Mathews-Aydınlı's (2011) argument. Furthermore, according to Henriksen (2013), many researchers nowadays adopt a combined approach, either using the frequency approach to find formulaic language and then using their judgment to determine whether the words have a meaning relationship, or using the frequency approach to select the items to be tested under a psychological approach. Therefore, both the psychological and the frequency approaches will be taken into account in this literature review, while the phraseological approach, which focuses on idioms, will not be addressed.

DEFINITIONS OF FORMULAS
One of the first definitions of formula can be found in Jespersen (1924/1976), who said that formulas "must always be something which to the actual speech instinct is a unit, which cannot be further analyzed or decomposed in the way a free combination can" (p. 88). Later, Bolinger (1976) would say that "our language does not expect us to build everything starting with lumber, nails and blueprint, but provides us with an incredibly large number of prefabs" (p. 1). Fillmore (1979) also said that "a very large portion of a person's ability to get along in a language consists in the mastery of formulaic utterances" (p. 92).
From the late 1980s onwards, a plethora of terms has been used to define formulaic language, many of them related to the development of corpus linguistics tools and new ways of analyzing language. Wray (2005, p. 9) presents the terms found to describe formulaic language in the figure below:

BELT | Porto Alegre, July-Dezember 2019;10(2): e34129 Larissa Goulart | The use of collocations across proficiency levels Artigos

Figure 1 - Terms used to describe formulaicity (Wray, 2005, p. 9)

Ideally, a literature review concerned with language development and the use of formulaic language would address all of these terms. Due to time and space constraints, however, this literature review focuses only on collocations, which are defined below. The reason is that many studies on collocations and language development have been published recently, whereas research on the other types of formula has been less prolific on the issue of language development. It is worth mentioning, though, that several recent studies on lexical bundles (e.g., Staples et al., 2013; Huang, 2015) have dealt with language development as well.
Durrant and Mathews-Aydınlı (2011, p. 60) define collocations as successions of linguistic entities that are best learned as integral wholes or independent entities, rather than by the process of placing together their component parts, either because (a) they may not be understood or appropriately produced without specific knowledge, or (b) because they occur with sufficient frequency that their independent learning will facilitate fluency. Men (2018) defines collocations as transparent in meaning (e.g., make a decision), rather than opaque, as idioms are (e.g., raining cats and dogs). While Durrant and Mathews-Aydınlı's (2011) and Men's (2018) definitions deal mainly with meaning, it is also worth taking into account the form collocations take. Collocations have restricted commutability; in other words, the node word has a limited set of words that can co-occur with it (e.g., commit a crime). Furthermore, they are frequently strings of two or three words that follow one of these grammatical patterns: verb + noun, adjective + noun, preposition + noun, adjective + preposition, noun + noun, adverb + verb, adverb + adjective. It is not uncommon for researchers to deal with only one type of collocation, for example only verb + noun collocations, as will be discussed in this literature review.
Another important point regarding research on collocations is the span within which co-occurrence is counted. While some authors define collocations as two or three adjacent words, most definitions state that these words do not necessarily occur consecutively (Durrant and Schmitt, 2009). Words that co-occur can appear within a 4:4 span, meaning that they can be separated by up to three words either to the right or to the left. Furthermore, collocates can be identified by association measures such as the t-score or mutual information (MI). The use of these two measures to define collocations was proposed by Durrant and Schmitt (2009). The t-score measures the certainty of an association between two words, emphasizing collocations that are very frequent, while the MI score indicates the degree to which two lexical items in a collocation occur more frequently than would be expected by chance.
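The two association measures described above can be illustrated with a minimal Python sketch. It assumes one common formulation, in which the expected co-occurrence frequency is the product of the two word frequencies divided by the corpus size, MI is the base-2 logarithm of the observed over the expected frequency, and the t-score is the difference between the observed and the expected frequency divided by the square root of the observed frequency. The function names and the counts in the example are hypothetical, and no adjustment for span size is made; individual studies may compute these scores differently.

```python
import math


def expected_cooccurrence(freq_node, freq_collocate, corpus_size):
    """Co-occurrence frequency expected by chance (assumed formulation)."""
    return freq_node * freq_collocate / corpus_size


def mutual_information(observed, freq_node, freq_collocate, corpus_size):
    """MI: log2 of observed over expected co-occurrence frequency."""
    expected = expected_cooccurrence(freq_node, freq_collocate, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, freq_node, freq_collocate, corpus_size):
    """t-score: (observed - expected) divided by sqrt(observed)."""
    expected = expected_cooccurrence(freq_node, freq_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


# Hypothetical counts in a corpus of one million words.
CORPUS_SIZE = 1_000_000

# A pair of frequent words that co-occur often.
frequent_pair = dict(observed=100, freq_node=1000, freq_collocate=1000)
# A pair of rare words that almost always occur together.
rare_pair = dict(observed=8, freq_node=10, freq_collocate=10)

for label, pair in [("frequent", frequent_pair), ("rare", rare_pair)]:
    mi = mutual_information(corpus_size=CORPUS_SIZE, **pair)
    t = t_score(corpus_size=CORPUS_SIZE, **pair)
    print(f"{label}: MI = {mi:.2f}, t-score = {t:.2f}")
```

With these invented counts, the frequent pair receives the higher t-score and the rare but exclusive pair receives the higher MI score, which is consistent with the description of the two measures above.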
For this literature review, collocations are defined as any combination of two or three words in which one of the words is a noun, adjective, adverb, or verb, occurring adjacently or not. This definition is rather loose compared to previous definitions. However, one of the goals of this review is to verify which methodologies have been adopted in research on collocations in L2 learners, and a narrower definition would exclude papers that might be relevant.
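To make the notion of non-adjacent co-occurrence concrete, the sketch below counts the words that occur within a given number of positions to the left and right of a node word. It is only an illustration under simplifying assumptions: the text is already tokenized, the function name and the example sentence are hypothetical, and sentence boundaries are ignored.

```python
from collections import Counter


def collocates_in_span(tokens, node, span=4):
    """Count words occurring within `span` positions of each `node` token."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - span)                # left edge of the window
        end = min(len(tokens), i + span + 1)    # right edge (exclusive)
        for j in range(start, end):
            if j != i:                          # skip the node itself
                counts[tokens[j]] += 1
    return counts


# Hypothetical example: "decision" is separated from "make" by three words.
sentence = "she had to make a very difficult decision yesterday".split()

within_four = collocates_in_span(sentence, "make", span=4)
within_three = collocates_in_span(sentence, "make", span=3)

print("decision" in within_four)   # counted with a 4:4 span
print("decision" in within_three)  # missed with a 3:3 span
```

The example shows why the span needs to be reported: the same verb + noun pair is counted as a co-occurrence under one window size and missed under a narrower one.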
Finally, considering the points raised in the discussion above, the guiding research questions for this literature review are:
RQ1 - How does proficiency influence the use of collocations by L2 students?
RQ2 - What methodologies have been used to investigate the use of collocations across proficiency levels?
RQ3 - In corpus linguistics research, which registers have been investigated?
The aim of this section was to present the different approaches used in the study of formulaic language, and to describe the definition used for this literature review. In the next section, the methodology used to gather the papers is described.

METHODOLOGY
The first step of the literature review was to search a major database using the query: collocation* OR formulaic language AND language development. For this, I used Web of Knowledge (with the Arts and Humanities filter). I also ran the same search ("collocation*" OR "formulaic language" + "language development") on Google Scholar. I then analyzed the abstracts of the resulting papers to determine whether they presented research on the use of formulaic language by second language learners and, if so, whether they mentioned the proficiency level of these learners. The second step was to go through the bibliographies of the papers to check for further studies relevant to the research questions. This literature review only included papers written in English.
The initial search resulted in 31 papers. After further analysis, however, some papers were excluded because they dealt with the effects of instruction on learners' production of formulas, the use of idioms by L2 learners, or descriptions of collocation tools to be used by learners. Studies based only on interviews with students about their own perception of development (e.g., Barfield, 2008) were also excluded, as were studies that did not report learners' proficiency level. In the end, 23 papers were analyzed for this literature review. The next section discusses the results of the literature review.

RESULTS
In this section, the results of the studies on the use of collocations by L2 students are discussed. It is worth highlighting that this overview specifically considers how these studies address the research questions, which means that other issues could be discussed based on the research reviewed in these papers.
In total, 23 papers were analyzed; most of them adopted a frequency approach to the study of collocations (Men,
Most of the language development studies aimed at evaluating whether the use of collocations could be a predictor of proficiency level (Paquot, 2018; Crossley et al., 2015; Koosha & Jafarpour, 2006; Bonk, 2000). Table 1, below, presents a summary of all of the papers reviewed. The following subsections address each research question separately.

82 students of an MA in TESOL at a university in Jordan. The students were classified as advanced learners.

Dictionary task completion: Learners were given 22 verbs, 12 high-frequency and 10 low-frequency.
Students were asked to provide adverb collocates for these verbs using only a learner's dictionary.
According to the authors, only 10% of the responses were appropriate, showing that even with the help of dictionaries learners were not familiar with the collocations for the verbs selected. Another finding was that subjects that used the dictionary had higher scores in the low-frequency verb collocates than those who did not use it.
Yoon (2016)
Corpus Linguistics: The author used the CollGram technique, which analyzes the bigrams in the corpus, regardless of parts of speech, and compares each bigram's association scores (t-score and MI) with those of the same bigram in a large reference corpus, such as COCA or the BNC.
When comparing essays with higher grades and essays with lower grades, the association is stronger in the former. The authors argue that the MI scores of the bigrams are positively correlated with the quality of the essays. Considering the longitudinal study, the analysis showed a significant evolution of the t-score between the first text and the last text written by students.

Argumentative Essays
Granger and Bestgen (2014)  The authors created a list of high frequency nouns in the native speakers corpus and then extracted the collocations from these nouns using the corpus and collocational dictionaries. The same procedure was adopted for the L2 corpus. The focus is on verb plus noun collocations.
The results showed that native speakers produced almost twice as many collocations as the L2 learners. Taking into consideration a comparison across levels, the author noticed that advanced and intermediate levels produced significantly more erroneous collocations than beginners. The number of collocations produced also increased with proficiency.

Argumentative Essays
Wolter and Gyllstad (2011) -Collocational links in the L2 mental lexicon and the influence of L1 intralexical knowledge.
The aim of this paper was to analyze the perception of collocations by native speakers and L2 learners.
The participants of the research were 31 Swedish advanced learners of English and 37 native speakers who served as a control group.

Lexical Decision Task:
All the collocations in this study were verb plus object-noun collocations. The authors of this paper created a list of 440 collocation items, with 99 items being real collocations and the other items being distractors or made up words and asked native speakers and L2 students to determine if the items were collocations or not.
The results showed that the response time of native speakers was shorter than that of L2 learners when determining whether a collocation was correct or erroneous. Furthermore, this study showed that L2 students were more successful with collocations that were congruent with their L1.

The results showed that lower-frequency collocations are more common in native speakers' texts than in texts from L2 students. Non-native speakers used more collocations with a stronger association.

Research Proposals and Argumentative Essays
Wang and Shaw (2008) - Transfer and universality: collocation use in advanced Chinese and Swedish learner English
The purpose of this research was to describe the use of verb plus noun collocations in a corpus of advanced English learners.
The participants were 100 Chinese students and 100 Swedish students at undergraduate level, who wrote short essays.
Corpus Linguistics: The authors chose the most frequent verbs in both corpora and analyzed their collocations in a collocation dictionary to determine if the collocations were correct or erroneous.
The lexical variety of the collocations was slightly higher for Swedish learners than for Chinese learners. However, the total occurrence of verb plus noun collocations was similar in both corpora, suggesting that regardless of L1, advanced students use the same amount of collocations. This study also showed that L2 learners of English use fewer collocations than native speakers.

Argumentative Essays
Revier (2008)
The results showed that there was a significant difference between students at the first level and the third level, but the same did not occur with the second level.

Koosha and Jafarpour (2006) - Data-driven learning and teaching collocation of prepositions: the case of Iranian EFL adult learners
This study had two main aims: the first was to see whether data-driven learning would help students learn collocations with prepositions, and the second was to assess how proficiency played a role in the correct use of collocations with prepositions.
200 English major students at Iranian universities participated in this study. They were divided into different proficiency levels according to the Michigan. Students were also divided into an experimental and a control group.
Translation task: Focusing on prepositions and their collocations (noun + prep, adjective + prep, prep + noun, verb + prep, prep + prep, and idiomatic expressions), especially those collocations that are difficult for Iranian learners.
The authors claim that collocation knowledge could be used as a measure of proficiency, since the correct use of collocations correlates with learners' proficiency level.

Koya (2005) -The acquisition of basic collocations by Japanese learners of English
This dissertation investigated how learners' passive and active knowledge of collocation related to level of proficiency.
The participants were 130 Japanese university students at four proficiency levels, which was measured through the vocabulary size test.
Multiple choice task and a translation task: The first evaluated learners' receptive vocabulary based on collocation dictionaries, and the second was a productive task with 68 collocations selected by the author based on previous research.
Learners with a bigger vocabulary used more collocations correctly. According to the author, passive knowledge of collocations also correlated with productive knowledge of collocations.
The results showed that students were able to identify correctly 55% of the collocations. It also showed that years of L2 study had no impact in the collocation test.
Nesselhauf (2003)
Of all the verb plus object-noun combinations, 846 were free combinations, 13 were idioms, and 213 were collocations. Of the collocations identified, 56 were considered erroneous by native speakers.

Argumentative Essays
Altenberg and Granger (2001) -The grammatical and lexical patterns of make in native and non-native student writing.
This study investigated the use of collocations with high frequency verbs, especially make, by L2 students of English.
The researchers used the French part of the ICLE corpus and essays written by Swedish students. Both groups of students were classified as advanced. The LOCNESS corpus was selected as a reference for native speakers' use of collocations.
Corpus Linguistics: The authors analyzed the frequency of the verb make in the three corpora (French ICLE, Swedish students, and LOCNESS). The authors analyzed the collocations of make (in a span of 3 words to the right).
The results show that the use of make as a delexical verb proves to be difficult for advanced learners in both language backgrounds.

Bonk (2000) -Testing ESL learners' knowledge of collocations
In this study the author sought to assess a collocation test, as well as determine whether language proficiency correlates to collocation proficiency. The researcher gave learners a reduced version of the TOEFL test and a version of the collocation proficiency test.

L2 students enrolled at University of Hawai'i participated in this study
Multiple Choice Task: The questions tested verb plus object-noun collocation, verb plus preposition and figurative-use-of-verb phrases.
The results showed a moderately high level of correlation between proficiency and collocational proficiency.

Granger (1998) - Prefabricated patterns in advanced EFL writing: collocation and formulae.
The aim of this paper was to investigate the use of collocations involving adverbs in the ICLE corpus compared to the same structures in a corpus of native speakers.
This research used texts written by 56 French L1 students and 56 native speakers.
Corpus Linguistics: The author created a list of the most frequent adverbs used with the meaning of amplification in both corpora and then analyzed the words that collocated with those adverbs in the corpora.
The results showed that learners used more collocations that were congruent with their first language. Nevertheless, in total native speakers had more occurrences of collocations with adverbs.

Argumentative Essays
Farghal and Obiedat (1995) - Collocations: a neglected variable in EFL
The goal of this study was to show that basic collocations on the topics of food, color, and weather are a problem for learners of English. The authors created a list of 22 collocations on these topics and tested three levels of English users, asking them to complete the collocation.
The fill in the blank task was completed by 43 English majors. These were students in two levels, juniors and seniors. The translation task was completed by 23 English teachers.
Fill in the blank questionnaire and translation: The first test gave a word in English and asked the participants to produce its collocation, the second test gave an expression in Arabic and asked the participants to translate it to English.
The results showed that in both groups learners relied on synonyms, rather than using the appropriate collocation. Furthermore, for the translation task the advanced learners adopted different strategies, such as, paraphrasing and translating exactly as it is in their L1.
The purpose of this paper was to investigate the knowledge of verb plus noun collocations in advanced learners of English with German as an L1.
The participants were 58 English majors at a university in Germany.
Cloze task and a translation task: These tasks focused on verb plus noun collocations.
The results showed that in the translation task advanced students mistranslated 35.1% of the collocations. In the cloze task, more than 50% of the collocations were erroneous.

HOW DOES PROFICIENCY INFLUENCE THE USE OF COLLOCATIONS BY L2 STUDENTS?
This section seeks to answer a question that seems rather simple: how does proficiency influence the use of collocations by L2 students? However, the review shows that the answer is not as simple as "more proficiency, more collocations". All studies reviewed show that more advanced students do use more collocations, regardless of which part of speech the collocations investigated belong to. Nevertheless, there are more components to this question than previously anticipated. The first is presented by Men (2018) and Revier (2008), who show that the increase in the use of collocations seems to be larger from lower to intermediate level than from intermediate to advanced level. Another issue discussed in these papers is that even though advanced students use more collocations than lower-level students, they do not show lexical variety (Wang & Shaw, 2008; Granger, 1998); that is, advanced students tend to repeat the same collocations. Yet, when compared to intermediate and beginner students, advanced students use more low-frequency collocations (Yoon, 2016), while lower-level learners use more collocations with high-frequency words. In addition, even though advanced students produce more collocations, the number of erroneous collocations also increases with level (Men, 2018; Nesselhauf, 2003; Bahns & Eldaw, 1993). These authors noticed that L2 learners usually produce erroneous collocations within the same semantic field. Although advanced learners are the ones who use the most collocations, they produce only a little over half of the number of collocations used by native speakers (Yoon, 2016; Laufer & Waldman, 2011). Therefore, these results sustain the claim that collocations are an issue for L2 learners even at more advanced levels.
In sum, the research described in these papers shows that the use of collocations increases with proficiency. Nevertheless, L2 speakers do not show collocation variety, and their use of collocations falls behind that of native speakers in a frequency count. Furthermore, lower-level and higher-level learners differ in their use of low-frequency collocations: higher-level learners tend to use more of them.

WHAT METHODOLOGIES HAVE BEEN USED TO INVESTIGATE COLLOCATIONS?
As discussed in section two, there are three approaches to the study of collocations. This review analyzed only two of them, the frequency and the psychological approach. The frequency approach, which is associated with corpus linguistics methods, was used in 14 out of the 23 papers reviewed. The remaining 9 papers adopted a psychological approach. In this section, the methods used for the extraction or analysis of the collocations are discussed.
Among the papers that used corpus linguistics methods, some tendencies can be observed. The first is the extraction of all combinations of two or three words without a frequency threshold or any type of association measure. In several studies (Men, 2018; Crossley et al., 2015; Namvar, 2012; Laufer & Waldman, 2011; Wolter & Gyllstad, 2011; Li & Schmitt, 2009; Nesselhauf, 2003; Altenberg & Granger, 2001; Granger, 1998) it is unclear whether the researchers did not use any threshold or simply did not report it. This matters because, without a frequency threshold, any combination of two words could be considered a collocation if it appeared in the reference corpus or was considered correct by the researcher. As for association measures, only six studies (Men, 2018; Yoon, 2016; Durrant & Schmitt, 2009) report having used either MI or t-scores. The lack of frequency thresholds and association measures is an issue for studies of collocations, as it makes the definition of this phenomenon quite broad and is likely to affect the results of the studies.
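The effect of leaving out thresholds can be shown with a small Python sketch. It extracts adjacent bigrams from a tokenized text and keeps only those that meet a minimum frequency and a minimum MI score. The function name, the toy text, and the threshold values (a frequency of at least 2 and an MI of at least 3) are assumptions made for illustration only; they are not taken from any of the studies reviewed, and MI is computed in one common formulation without adjustment for span size.

```python
import math
from collections import Counter


def extract_collocations(tokens, min_freq=2, min_mi=3.0):
    """Keep adjacent bigrams that pass a frequency and an MI threshold."""
    corpus_size = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    kept = {}
    for (first, second), observed in bigrams.items():
        if observed < min_freq:
            continue  # too rare to be treated as a collocation
        expected = unigrams[first] * unigrams[second] / corpus_size
        mi = math.log2(observed / expected)
        if mi >= min_mi:
            kept[(first, second)] = (observed, mi)
    return kept


# Hypothetical toy text; punctuation is kept as a token for simplicity.
text = (
    "commit a crime . commit a crime . "
    "the man saw the dog . the dog saw the man ."
).split()

with_thresholds = extract_collocations(text)
without_thresholds = extract_collocations(
    text, min_freq=1, min_mi=float("-inf")
)

print(sorted(with_thresholds))   # bigrams passing both thresholds
print(len(without_thresholds))   # every bigram type in the text
```

In this toy text, only the two bigrams from "commit a crime" pass both thresholds, whereas every bigram type in the text counts as a collocation when no threshold is applied.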
Another point to be considered in collocation studies is the methodology used to determine whether a collocation was correct. Most of the studies that compared the use of collocations between native speakers and L2 students used corpora of native speakers to verify whether the collocations were appropriate. Nevertheless, some studies relied on dictionaries (Men, 2018; Wang & Shaw, 2008) or a panel of judges (Li & Schmitt, 2009).
Considering that corpus linguistics is a methodology associated with frequency and computer tools, it seems that these researchers could adopt a common method for extracting collocations and for determining the appropriateness of each collocation.
Summarizing the results of the literature review with respect to methodology, it seems that the problems with the definitions of formulaic language have their roots in the different methods used to extract and analyze collocations. For example, translation tasks are dependent on learners' L1, while corpus linguistics studies entail different methods of collocation extraction.

IN CORPUS RESEARCH, WHICH REGISTERS HAVE BEEN INVESTIGATED?
As briefly mentioned in the introduction, different registers use linguistic features, collocations among them, in distinctive ways. Biber and Conrad (2009), for instance, explore register variation extensively based on certain linguistic features. The aim of this question is to determine which registers have been described based on the collocations used. This is relevant because the results regarding the use of collocations can be indicative of the register being investigated, as well as of the learners' proficiency.
As Table 1 shows, the argumentative essay is the most researched register in collocation studies (Yoon, 2016; Laufer & Waldman, 2011; Li & Schmitt, 2009; Durrant & Schmitt, 2009; Wang & Shaw, 2008; Nesselhauf, 2003; Altenberg & Granger, 2001; Granger, 1998), with only a few studies focusing on other registers, such as research papers (Paquot, 2018), narratives or narrative essays (Yoon, 2016; Namvar, 2012), free journal writing (Crossley et al., 2015), critiques (Li & Schmitt, 2009), and research proposals (Durrant & Schmitt, 2009). There are two issues with the low variety of registers being studied. The first is that the findings reported in these studies might be associated with the register being investigated. As Yoon's (2016) study shows, there is register variation in the use of collocations even between narrative and argumentative essays. The second is that most of the argumentative essays analyzed were written for English classes (Yoon, 2016; Crossley et al., 2015; Durrant & Schmitt, 2009), either English composition courses or intensive English programs, which do not represent disciplinary writing. The studies that do represent disciplinary writing (Paquot, 2018; Namvar, 2012; Li & Schmitt, 2009; Durrant & Schmitt, 2009; Koosha & Jafarpour, 2006; Nesselhauf, 2003; Altenberg & Granger, 2001) analyzed texts written by English or Linguistics majors.
In sum, even though there are plenty of studies on the use of collocations by L2 speakers, this review shows that most of them investigate the same type of speakers, English students in their English classes, and the same register, essays. This points to a gap in research on collocations: other registers, including disciplinary writing, remain to be investigated. The aim of this section was to present the results for the three research questions in light of the papers reviewed. The next section presents a discussion of the results and the limitations of this review.

DISCUSSION
After reading the studies reviewed, it is evident that Nesselhauf (2003) was right to argue that collocational studies, even those using the same approach, have a hazy definition of collocations. This was evident in the different uses of frequency measures, association measures, and even researchers' intuitions in the investigations reviewed. This literature review has mainly focused on language development studies. While reviewing the literature for the present paper, however, it became clear that there are other issues to be explored through a literature review. One of them is the role of L1 in collocation development. Some studies suggest that L1 plays a role in the correct use of collocations by learners (Men, 2018), while others suggest that, regardless of L1, learners at the same proficiency level use collocations to the same extent (Wang and Shaw, 2008). A second issue that could be explored is how the parts of speech that form a collocation affect its acquisition; a few studies (e.g., Men, 2018; Namvar, 2016) suggest that verb plus noun collocations are more difficult to acquire than adjective plus noun or noun plus noun collocations.
Considering future research on collocations, it seems critical to report the word span in which the collocations analyzed occur, and also to adopt association measures in the extraction of collocations; otherwise, any string of words occurring in the corpus will be considered a collocation. Furthermore, researchers could investigate other registers, especially to describe register differences in the use of collocations, as the study conducted by Yoon (2016) did for narratives and argumentative essays.
Finally, the results of the literature review show that collocations are an issue even for advanced learners: even though more proficient students use more collocations, they also make more mistakes when using these structures. This supports Nesselhauf's (2002) suggestion that teachers should teach collocations explicitly, since learners usually see them as open choices rather than as units. That is, English teachers should not assume that learners will acquire collocations just by encountering them many times in written texts and class materials. Words that collocate should be taught as units of language in English classes.