DDL Meta-analyses

Last year saw the publication of the first large meta-analysis of DDL (Boulton & Cobb, 2017) and it had generally positive findings for the field; for example (emphasis added):

Focusing only on the most robust results (i.e., MVs with at least 10 unique samples in both P/P and C/E designs), 70% had large effects, 25% medium, and only 5% small or negligible. The most consistent large effects showed that DDL is perhaps most appropriate in foreign language contexts for undergraduates as much as graduates, for intermediate levels as much as advanced, for general as much as specific/academic purposes, for local as much as large corpora, for hands-on concordancing as much as for paper-based exploration, for learning as much as reference, and particularly for vocabulary and lexicogrammar. Many of these findings go against common perceptions, and the elements missing from the list (e.g., skills or other language areas) are for the most part missing because there is as yet insufficient research rather than because research evidence is against them.

I bolded “vocabulary and lexicogrammar” because in the past week another meta-analysis was published. This meta-analysis looked at corpus use and vocabulary learning (Lee, Warschauer, & Lee, 2018). They found that:

[O]ur meta-analysis showed a medium-sized effect on L2 vocabulary learning, with the greatest benefits for promoting in-depth knowledge to learners who have at least intermediate L2 proficiency. Corpus use was also more effective when the concordance lines were purposely selected and provided and when learning materials were given along with hands-on corpus-use opportunities. Moreover, we found that corpus use is still effective even without prior training and remains effective regardless of the corpus type or the length of the intervention.

The details of both papers are worth digging into, especially if one is statistically minded. Furthermore, Boulton & Cobb make a number of recommendations for what kinds of studies and data could benefit the field in the future, while Lee, Warschauer, & Lee also mention that certain types of data/studies could help with the problem of small sample sizes in future meta-analyses. In other words, if you do or are interested in doing DDL or corpus-use research, these papers point out gaps where research could be done and would be useful.

So the gist is that both papers are excellent contributions to understanding the effects of DDL and moderating influences, and they point to future research needs.


(Links are above)

Boulton, A., & Cobb, T. (2017). Corpus Use in Language Learning: A Meta‐Analysis. Language Learning, 67(2), 348-393.

Lee, H., Warschauer, M., & Lee, J.H. (2018). The Effects of Corpus Use on Second Language Vocabulary Learning: A Multilevel Meta-analysis. Applied Linguistics.

 


Me and I, I and Me: Frequency effects?

Russ Mayne wrote an interesting post a few days ago on the constructions “my wife and I” and “me and my wife”. I fully agree with him regarding the ‘correctness’ of either construction and the main thrust of his post, but I did have a thought about one very specific part of his post.

Here is the pertinent part of Russ’ post:

The words ‘Me and my wife’ are in the subject position (at the start of the sentence) and so we should use the subject pronoun ‘I’ 

English sentences usually start with subjects. so in ‘I love you’, I is the subject. If it were the object it would change to ‘me’ such as ‘you love me’. The sentence ‘me and my wife went to the party’ seems to flaunt this rule because ‘me’ is in the subject position and so it should be I.

The problem with this argument is, were it true, the sentence ‘I and my wife went to the party’ would be a perfectly proper sentence, after all, the subject is properly ‘I’. However, ‘I and my wife’ sounds a bit off to me. So is something else is going on here?

McWhorter makes the rather bold claim that ‘me’, not ‘I’ is in fact English’s subject pronoun and that I is a rather special word that is only used when there is only one subject before the verb. Therefore ‘I went to the party’ sounds OK, and ‘me and the lads went to the party’ sounds OK, but ‘I and the lads went to the party’ doesn’t sound right because there is more than one subject. I’d never heard this argument before but I’d welcome some disconfirming evidence.

I’m not going to claim that I have disconfirming evidence, but I would like to express my skepticism and suggest that frequency effects are playing a large role (it is also relevant here that pronouns are more slippery than many people realize).

I also find that the construction “I and my wife” sounds a bit off. However, I doubt the notion of “I is a rather special word that is only used when there is only one subject before the verb”. Rather, because I encounter “me” used in these constructions much more frequently than “I”, that is the construction that sounds right to me.

Here is a bit of data from the BNC:

[Images: BNC search results for “me and my wife”, “me and my *”, “I and my wife”, and “I and my *”]

Looking at these lines, “me and my wife” is more frequent than “I and my wife”, though only a relatively small sample was found (7:3). The open “me and my *” was more frequent than “I and my *”. In this case the ratio is 244:83. To me, those 83 occurrences of “I and my *” are interesting. The construction is in use. Is the sample large enough (are enough people using it) to assume these aren’t simply mistakes or production errors? I think for the moment it is. Presumably, the people using the construction don’t find anything about it that sounds wrong, or feel a need to stop and rephrase. And, as odd as it sounds to me, I don’t really think there is anything wrong with it.
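For anyone who wants to put the two ratios on a common scale, here is a minimal sketch in Python that turns the counts reported above into proportions. The function name `share` is made up for illustration; the counts are simply the BNC figures quoted in this post.

```python
def share(count_a: int, count_b: int) -> float:
    """Return count_a as a proportion of the combined total."""
    total = count_a + count_b
    if total == 0:
        raise ValueError("both counts are zero")
    return count_a / total


# Counts reported above from the BNC searches.
me_wife, i_wife = 7, 3      # "me and my wife" vs. "I and my wife"
me_any, i_any = 244, 83     # "me and my *" vs. "I and my *"

print(f"me and my wife: {share(me_wife, i_wife):.0%}")  # 70%
print(f"me and my *:    {share(me_any, i_any):.0%}")    # 75%
```

On these counts, “me” accounts for roughly 70% and 75% of the hits respectively, so the two searches point the same way, though the first sample is far too small to lean on.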

A few years ago in Slate, Gretchen McCulloch wrote:

Your sense of English as a whole is really an abstract combination of all of the idiolects that you’ve experienced over the course of your life, especially at a young and formative age. The conversations you’ve had, the books you’ve read, the television you’ve watched: all of these give you a sense of what exists out there as possible variants on the English language. The elements that you hear more commonly, or the features that you prefer for whatever reason, are the ones you latch onto as prototypical.

I think that my preference for “me and my *” over “I and my *” isn’t rooted in anything inherent about the subjective/objective properties of I/me, but stems from the frequency of my encounters (including my own production) with the construction(s). In other words, I doubt that there is a binary I=sub. pronoun and me=obj. pronoun (or vice versa) quality at work here, but that either pronoun can be sub/obj depending on context, and that my idiolect (more specifically for the matter at hand, my internal grammar about which construction is preferred in certain situations) is influenced, if not driven, by frequency effects of usage.

In summary: I prefer “me and my *”. But the corpus data shows that some other people use “I and my *”. I think that both are grammatical and that the preference for one or the other is largely a product of frequency effects on the idiolect. That is, frequency of exposure affects whether something ‘sounds right’.

(One more complicating factor: I don’t know enough about phonology to be sure, but I wouldn’t be surprised if the majority preference for “me and my *” had a phonological aspect, too. Something along the lines of a preference for duplicating the initial “m” or avoiding the rhyme of “I and my”. Then again, I might just be showcasing my idiolectal preference for both of those things 😉 )

A smorgasbord of DDL journal activity

Last month, in addition to the release of new corpora, two journals released special issues dedicated to DDL/CL in language learning.

One is the open-access Language Learning & Technology. I haven’t read it yet, but the table of contents looks very interesting. The other one is Language Testing. It’s interesting to see how CL and questions of assessment interact.

Finally, though not a whole dedicated issue, ReCALL has an online first article titled ‘Unlearning overgenerated be through data-driven learning in the secondary EFL classroom’. This will be the first article I get to, as overgenerated be is a recurring issue for many of my students and I’m curious to see what the authors found.

What bounty 🙂


UPDATE

If the ReCALL link above isn’t working for you, here is the doi: https://doi.org/10.1017/S0958344017000246

 

 

Against the Capitol I met a lion, Who glared upon me, and went surly by

It is, I believe, generally accepted that who can be used to refer to animals (or at least some animals, or at least under certain conditions). Thus, Merriam-Webster’s online dictionary has this entry (I’ve outlined the key part in red):

[Image: Merriam-Webster online dictionary entry for “who”]

However, Merriam-Webster’s online learner’s dictionary has this entry:

[Image: Merriam-Webster Learner’s Dictionary entry for “who”]

There is no mention that who can be used for animals. And it isn’t only MW. I checked six learner’s dictionaries and none of them said this was acceptable, or an option. This isn’t a huge deal, necessarily, but it can lead to confusion if, say, this has been taught as a ‘rule’ and then students read graded readers where the ‘rule’ is broken without any consequence (search for horse who, fish who, or monkey who in the Lextutor graded reader corpus, for example). Of course, there are tons of sources where students might encounter who being used for animals.
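If you have a plain-text collection of your own and no concordancer handy, a rough version of the horse who / fish who / monkey who search can be sketched in Python. Everything here is an assumption for illustration: the function name `count_relativizers` is hypothetical, the sample sentences are invented (they are not from Lextutor), and a bare pattern match will also catch cases where that is not really a relative pronoun, so the hits still need to be read by eye.

```python
import re
from collections import Counter


def count_relativizers(text: str, nouns: list[str]) -> Counter:
    """Count (noun, word) pairs where a noun is directly followed by who/which/that."""
    noun_group = "|".join(re.escape(n) for n in nouns)
    pattern = re.compile(rf"\b({noun_group})\s+(who|which|that)\b", re.IGNORECASE)
    return Counter(
        (m.group(1).lower(), m.group(2).lower()) for m in pattern.finditer(text)
    )


# Invented sample text, standing in for a real corpus file.
sample = (
    "There was once a horse who loved apples. "
    "The fish that got away was huge. "
    "I saw a monkey who stole a hat, and a horse which ran fast."
)

counts = count_relativizers(sample, ["horse", "fish", "monkey"])
for (noun, word), n in sorted(counts.items()):
    print(noun, word, n)
```

A list of hits like this is only a starting point; the interesting part for learners is reading the lines and talking about why who, which, or that was chosen.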

This came up because a student of mine had written “I don’t want a dog who is so big” and a peer suggested it should be which. And that’s FINE. It can be which :-). Or that :-). But it can also be who :-D.

For teachers who like to do consciousness raising or language awareness activities, this kind of situation provides opportunities for discussing what to do when language you read or hear in real life doesn’t seem to match what the dictionary or grammar guide says, for analyzing lines of ‘controversial’ lexicogrammar patterns, or for formulating ideas about why people choose to use who, or which, or maybe that in different circumstances (does it seem ok for certain animals but not others? is it special for pets? does it change the meaning/tone/nuance?).

Of course, the underlying ideas could be applied to other language questions, too. So, in general, the ‘corpus lesson’ here is that corpora can be used to explore alternatives to more conventional patterns and aid in developing greater language awareness. Corpus-use can be applied to not just learning frequent or common patterns of expression, but to expanding the ways in which learners are able to express themselves.


While talking about this with another teacher, it was suggested that maybe the learner’s dictionaries (and perhaps some other learner-oriented materials) don’t acknowledge who for animals as acceptable because it’s new (recent) and thus ‘non-standard’. But I have trouble seeing which of these would be considered ‘non-standard’ (in fact, I doubt that in many cases fluent English users would even notice this usage unless it were pointed out or they were looking for it). And it’s not really a recent thing, is it?:

Against the Capitol I met a lion,

Who glared upon me, and went surly by

Julius Caesar, 1.3 20-21


This corpus-based analysis (Gilquin & Jacobs, 2006) of who being used with animals may also be of interest.

Alive (and a meta-analysis)

Well, that was a longer break from blogging than I expected. I’ve had a tremendously busy winter, but hopefully I can get back to updating more frequently.

The first item on my list is to mention that Boulton & Cobb’s meta-analysis of DDL has been published. It adds context, detail, and discussion to their earlier slides.

It’s not a quick read, but the short version is that DDL works quite well in general; there are very encouraging results and several medium-to-large effect sizes were found. Going forward there needs to be more fine-grained research on for whom, for what, under what conditions, and for how long DDL works well. They also make some important points about what information needs to be included in the future by researchers doing quantitative work on DDL.

To not mention*, it just feels right

@anthonyteacher has a great post at his site discussing the patterns “not to VERB” and “to not VERB”. He writes about his students’ reactions to the constructions, his own view, and some findings from Google N-grams and COCA. You should read his post in full.

I basically agree with everything he says, but there is one point he makes that I would like to extend a little bit. So I’d like to highlight this paragraph from his post, and especially the statement I put in red:

All of this data tells me several things. First, “to not” is on the rise, most likely due to the fact that the ability to separate an infinitive has become more accepted and “to not” has probably rolled in through a snowball effect. Second, the placement of “not” does not necessarily imply emphasis, as can be seen in the sentences above. Third, while my speech may make some of the older generations shake their first with anger, possibly telling me I am killing English, I can now reply confidently that my speech is the vanguard of an English where “not” is as placement-fluid as “they” is gender-fluid. My speech may be a speech that is likely to boldly go where few have gone before. Or to not boldly go, because language change is really unpredictable, and this is just a tiny thing.

I chose to highlight this section because I felt that sometimes my own choices regarding placement of “not” are definitely, if not necessarily, done for emphasis, but after thinking about it I don’t think it is a matter of placing emphasis per se. Rather, it is about restricting possible meanings/uses.

Let me explain. Here are two partial lines from COCA (query terms: not to mention):

1) … He would talk only if I promised not to mention he lived in …

2) … But tours and marketing materials, not to mention data on the average student, won’t tell you if that college will …

In the first line, I, personally, would probably phrase that as “to not mention”, though not necessarily. The point is that both constructions feel natural to me. However, I can’t imagine myself saying that about the second line. To me, and the way I’m processing these constructions, the first line’s meaning is straightforward, but the second line’s meaning is based on my understanding of “not to mention” as a fixed or partially fixed expression in this instance.

In this case, the construction is not simply negating the mentioning of something (in fact, the thing in question is explicitly and necessarily mentioned/understood). Indeed, the online Cambridge Dictionary, for example, defines “not to mention” as a phrase used when you want to emphasize something that you are adding to a list.

So, generally speaking, I process “to not VERB” as basically interchangeable with “not to VERB” (with a personal preference for “to not VERB”) when the meaning is straightforward (i.e. negating the verb). But “not to VERB”, perhaps because of its associations with certain fixed expressions, seems to me to have a broader range of usage. Something like this:

“not to VERB”: can negate the verb or have idiomatic/figurative meaning and usage

“to not VERB”: restricted to negating the verb

Here is a sample of COCA lines from @anthonyteacher’s post:

[Image: COCA concordance (KWIC) lines for “to not”, academic section]

All the “to not VERB” uses here have meanings that can be understood as simply negating the verb. I suspect it would be this way throughout all the lines.

At least, until the language changes some more 😉


If I have said something glaringly, obviously wrong please tell me. Or if you have evidence of “to not VERB” used in an idiomatic/figurative way, please share it. Or if you have better choices for terminology etc. etc. etc. …

That rule that you don’t know that you know you know…is not a rule

A couple weeks ago a tweet from Matthew Anderson of the BBC blew up a little bit. The tweet had an excerpt from Mark Forsyth’s book The Elements of Eloquence: Secrets of the Perfect Turn of Phrase. The excerpt asserted that “adjectives in English absolutely have to be in this order: opinion-size-age-shape-colour-origin-material-purpose Noun. So you can have a lovely little old rectangular green French silver whittling knife. But if you mess with that word order in the slightest you’ll sound like a maniac.”

The tweet has more than 51,000 retweets and nearly 75,000 likes. It has also spawned articles like the following:

And a bunch more.

This is too credulous. Fortunately, Language Log was asked about the claim and explained how adjective order is not so simple. And many folks have put forth “big bad wolf” (size-opinion) as a clear violation of this so-called rule.

Forsyth responded by saying “big bad wolf” is actually a case of ablaut reduplication (think of words like zig-zag or tick-tock, they are never zag-zig or tock-tick) and, I think, he’s saying that this means it’s not subject to the rule that “adjectives in English absolutely have to be in this order”. If that’s the case, then absolute ≠ absolute.

But even if we let there be an exception for ablaut reduplication and decide that the ‘absolute’ rule for adjective order doesn’t apply in such cases, does the rule hold otherwise?

We can look for evidence.

The Language Log article mentioned above noted several hits in COCA for patterns that don’t fit the rule, including “big ugly”. I got on COCA myself to see what some of the concordance lines looked like, and ran a few other searches as well (other than “big ugly”, in the searches I purposely used words found in the example from the excerpt). Here are a few of the many, many hits that violate the ‘rule’:

Adjective pattern: size-opinion; Search terms: big ugly

  • This is a cute little town with a big ugly fire problem right now
  • On the platform was a big ugly man who talked loud and waved his arms around until his face was red
  • The construction of the big ugly office building was going very well

Adjective pattern: material-origin; Search terms: silver [j*] [nn*]

  • We gave her a small silver Ethiopian cross because we knew she was a devout Catholic

Adjective pattern: color-shape; Search terms: green [j*]

  • Inside the matchbox were nine little green triangular pills
  • One torn green triangular corner of the murdered Slap-of-Wack bar blows across the desert
  • he takes his pass from its slot in the green rectangular pass book before adding his name and time of departure to the long list

Adjective pattern: age-opinion, size-opinion; Search terms: [j*] lovely

  • seemed infinitely more beautiful for the presence of this new lovely white figure in their midst
  • At the doorway to the kitchen stood a tall lovely woman with long black hair and wearing a pink linen dress

None of these constructions sound maniacal.
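To make the comparison with the claimed order explicit, here is a small sketch in Python that checks an adjective string against the opinion-size-age-shape-colour-origin-material-purpose sequence. The category labels are hand-assigned for this post and follow the way I labelled the examples above (so silver counts as a material, not a colour); the dictionary and the function name `follows_claimed_order` are assumptions for illustration, not anything from Forsyth or COCA.

```python
# The order asserted in the excerpt, from first to last position.
CLAIMED_ORDER = ["opinion", "size", "age", "shape", "colour", "origin", "material", "purpose"]
RANK = {category: i for i, category in enumerate(CLAIMED_ORDER)}

# Hand-assigned categories (an assumption; many adjectives are ambiguous).
CATEGORY = {
    "lovely": "opinion", "ugly": "opinion",
    "big": "size", "little": "size", "small": "size", "tall": "size",
    "old": "age", "new": "age",
    "rectangular": "shape", "triangular": "shape",
    "green": "colour", "white": "colour",
    "french": "origin", "ethiopian": "origin", "bahamian": "origin",
    "silver": "material", "wooden": "material",
    "whittling": "purpose",
}


def follows_claimed_order(adjectives):
    """Return True if the adjectives never step backwards in the claimed order."""
    ranks = [RANK[CATEGORY[adj.lower()]] for adj in adjectives]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


forsyth = ["lovely", "little", "old", "rectangular", "green", "French", "silver", "whittling"]
print(follows_claimed_order(forsyth))                  # True
print(follows_claimed_order(["big", "ugly"]))          # False
print(follows_claimed_order(["silver", "Ethiopian"]))  # False
print(follows_claimed_order(["green", "triangular"]))  # False
print(follows_claimed_order(["new", "lovely", "white"]))  # False
```

The check itself is trivial; the point is that attested strings from COCA fail it while sounding perfectly normal.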

I think the pattern Forsyth describes is a general preference for order that is usually going to result in good patterning, but it’s definitely not an absolute rule. The rules governing adjective order are messier than that (see the Language Log post). I only felt the need to comment and remark on it because I noticed several ELT-related accounts passing it around and I thought “Really?? Did anyone check the data??” This post doesn’t have anything really in the way of using corpora in the classroom or anything like that, but hopefully it can be a reminder that corpora are excellent reference tools (especially to check sketchy ‘rules’).

**Update**

I realized someone might object to the “silver Ethiopian cross” example since an Ethiopian cross is a particular style of cross. So here’s a different example from COCA that demonstrates the same pattern of material-origin (search terms: wooden [j*] boat):

  • a traditional wooden Bahamian boat hand-built in the early 1960’s