Yesterday I wrote that the expression “achieve a breakthrough” sounded odd to me, and that I much prefer the more frequent “make a breakthrough”.

“Achieve a breakthrough” is, however, a robust collocation (after MAKE, ACHIEVE is the most frequent lemma with “a breakthrough” in COCA; and it has a high MI score). So why did it sound so odd to me? Have I somehow just not encountered it much? Well, maybe sorta.

In the COCA data, any form of ACHIEVE occurs immediately to the left of “a breakthrough” 24 times, while any form of MAKE in that position occurs 65 times, a ratio of about 1:3. COCA provides register/context information in its KWIC displays, and this info showed me that of ACHIEVE’s 24 occurrences, 2 were in spoken contexts (a little over 8%), while 18 of 65 MAKE hits were spoken (a little under 28%). So, the ratio in spoken contexts is 1:9. By contrast, in non-spoken contexts the ratio is about 1:2.
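For anyone who wants to check the arithmetic, here is a small Python sketch. It uses only the four counts quoted above; the variable and function names are my own and have nothing to do with COCA itself.

```python
# Counts quoted above: verb forms immediately left of "a breakthrough" in COCA.
achieve_total, achieve_spoken = 24, 2
make_total, make_spoken = 65, 18


def one_to_n(a, b):
    """Express the ratio a:b as 1:n, with n rounded to one decimal place."""
    return round(b / a, 1)


# Share of each verb's hits that come from spoken contexts, in percent.
spoken_share_achieve = round(100 * achieve_spoken / achieve_total, 1)
spoken_share_make = round(100 * make_spoken / make_total, 1)

# ACHIEVE : MAKE ratios overall, in spoken contexts, and in non-spoken contexts.
overall = one_to_n(achieve_total, make_total)
spoken = one_to_n(achieve_spoken, make_spoken)
non_spoken = one_to_n(achieve_total - achieve_spoken, make_total - make_spoken)

print(f"ACHIEVE spoken share: {spoken_share_achieve}%")
print(f"MAKE spoken share:    {spoken_share_make}%")
print(f"ACHIEVE:MAKE overall 1:{overall}, spoken 1:{spoken}, non-spoken 1:{non_spoken}")
```

The unrounded figures (1:2.7 overall, 1:2.1 non-spoken) are a bit less tidy than “about 1:3” and “about 1:2”, but the spoken ratio really is 1:9.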


I was in a speaking/listening skills oriented class, and we were talking about “achieve a breakthrough”. Perhaps the expression sounded odd to me because it’s not an expression I hear much, even if it would be unremarkable to me in a written context. That is, perhaps it is primarily an expression of written English, but not spoken, and thus it struck me as out of place or something.

I don’t actually think COCA is the best source of data about spoken English, but this sort of thing could be explored in other, maybe more suitable corpora.

Title note: when I said it sounded odd to me, I meant in a broad sense it seemed odd to me. Now I realize I may have unintentionally described how it really, literally, sounded odd to me.




________ a breakthrough

What is the first word that comes to mind to fill the blank in this post’s title?

If you’re like me, then it’s “make”. Obviously it doesn’t have to be “make”, but that’s what came to my mind.

We are at the beginning of our Autumn term here, and in chapter 1 of the coursebook there were a few vocabulary/collocation activities. Among them were these patterns:

  • _________ progress / a sacrifice / a profit
  • _________ your ambition / success / a breakthrough

The students had a bank of words to choose from to fill in the blanks, and this bank included make and achieve. Now, looking at the sets of words, it’s clear that make goes with progress / a sacrifice / a profit, while achieve goes with your ambition / success / a breakthrough.

However, “achieve a breakthrough” sounded a little odd to my ears. Not wrong. Odd. One of the pitfalls of constructing a collocation specifically for course/textbook purposes is that it often ignores other, more ‘natural’ collocations; and in this case actually ignored a verb that was in the exercise: “make”! A look at the top hits for “VERB a breakthrough” in COCA:


So, “made” and “make” are the top hits (by far), and then “achieve” in the 3rd spot. Now, of course “achieve a breakthrough” is fine and acceptable, and clearly attested in the data, but why would one teach that collocation without teaching “make a breakthrough” first (or as well)?
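The same kind of count can be roughed out on any plain-text collection, without COCA. What follows is only a minimal sketch: the sentences are made up by me to stand in for a real corpus, and the pattern simply grabs whatever word sits immediately left of “a breakthrough”, whereas a real corpus query would rely on part-of-speech tags and lemmas.

```python
import re
from collections import Counter

# Made-up sample sentences, standing in for a real corpus.
sample = [
    "Researchers made a breakthrough in battery design.",
    "We hope to make a breakthrough this year.",
    "The team achieved a breakthrough after months of talks.",
    "She made a breakthrough with her second novel.",
    "It was a breakthrough for the field.",
]

# Capture the single word immediately to the left of "a breakthrough".
pattern = re.compile(r"\b(\w+) a breakthrough\b", re.IGNORECASE)


def left_collocates(sentences):
    """Count the word forms that directly precede 'a breakthrough'."""
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts


counts = left_collocates(sample)
for word, freq in counts.most_common():
    print(word, freq)
```

Note that “was” turns up in the counts too, and that “made” and “make” are counted separately. Tagging and lemmatization are what let a tool like COCA deal with both of these.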

Anyway, I was going to wait a couple weeks before I introduced some corpus activities in class, but I decided this was a fine time to introduce and demonstrate a very basic use of a corpus. The two sections using this particular coursebook are relatively high level, and I was pleased to notice some students quite quickly grasping (through their asking of questions about what they were seeing) that corpora offer information about the target language that they wouldn’t get in a dictionary/grammar, and certainly not in their coursebook.

Call for Contributions: New Ways in Teaching With Corpora (deadline: September 3rd)

This seems to keep getting delayed / postponed. Perhaps not enough submissions? Perhaps not enough appropriate submissions? I don’t know. But if you were thinking of writing something and got too busy or didn’t finish in time, then good news: the new deadline is September 3rd.

Call for Contributions: New Ways in Teaching With Corpora

TESOL International Association is seeking submissions for a volume about classroom applications of corpora.

They are looking for straightforward, practical things that can be done with corpora.

I think the deadline was originally in the middle of May, but it seems to have been extended to July 15.

Check out the link above for more info about what kinds of submissions are appropriate and how to submit.

Against the Capitol I met a lion, Who glared upon me, and went surly by

It is, I believe, generally accepted that who can be used to refer to animals (or at least some animals, or at least under certain conditions). Thus, Merriam-Webster’s online dictionary has this entry (I’ve outlined the key part in red):


However, Merriam-Webster’s online learner’s dictionary has this entry:


There is no mention that who can be used for animals. And it isn’t only MW. I checked six learner’s dictionaries and none of them said this was acceptable, or an option. This isn’t a huge deal, necessarily, but it can lead to confusion if, say, this has been taught as a ‘rule’ and then students read graded readers where the ‘rule’ is broken without any consequence (search for horse who, fish who, or monkey who in the Lextutor graded reader corpus, for example). Of course, there are tons of sources where students might encounter who being used for animals.
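If you have your own texts on hand (a graded reader in plain text, say), a search like “horse who” can be approximated in a few lines of Python. This is a sketch under loud assumptions: the sentences below are invented, the animal list is a hypothetical starter set, and the pattern only catches who/which/that directly after the noun.

```python
import re
from collections import Counter

# Invented example sentences; replace with lines from a real text or corpus.
lines = [
    "I don't want a dog who is so big.",
    "There was once a horse who loved apples.",
    "The fish which he caught was tiny.",
    "She fed the monkey that lived next door.",
    "The teacher who helped me has retired.",
]

ANIMALS = ["dog", "horse", "fish", "monkey"]  # hypothetical starter list

# An animal noun (optionally with plural -s) followed directly by who/which/that.
pattern = re.compile(
    r"\b(" + "|".join(ANIMALS) + r")s? (who|which|that)\b", re.IGNORECASE
)


def relativizer_counts(text_lines):
    """Count (animal, following word) pairs such as ('dog', 'who')."""
    counts = Counter()
    for line in text_lines:
        for match in pattern.finditer(line):
            counts[(match.group(1).lower(), match.group(2).lower())] += 1
    return counts


counts = relativizer_counts(lines)
for (animal, word), freq in sorted(counts.items()):
    print(animal, word, freq)
```

The hits still need a human eye afterwards, since “that” after a noun is not always a relative pronoun.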

This came up because a student of mine had written “I don’t want a dog who is so big” and a peer suggested it should be which. And that’s FINE. It can be which :-). Or that :-). But it can also be who :-D.

For teachers who like to do consciousness-raising or language-awareness activities, this kind of situation provides opportunities for discussion: what to do if you read or hear some language in real life that doesn’t seem to match what the dictionary/grammar guide says; analyzing lines of ‘controversial’ lexicogrammar patterns; or formulating ideas about why people choose to use who, or which, or maybe that in different circumstances (does it seem ok for certain animals but not others? is it special for pets? does it change the meaning/tone/nuance?).

Of course, the underlying ideas could be applied to other language questions, too. So, in general, the ‘corpus lesson’ here is that corpora can be used to explore alternatives to more conventional patterns and aid in developing greater language awareness. Corpus-use can be applied to not just learning frequent or common patterns of expression, but to expanding the ways in which learners are able to express themselves.

While talking about this with another teacher, it was suggested that maybe the learner’s dictionaries (and perhaps some other learner-oriented materials) don’t acknowledge who for animals as acceptable because it’s new (recent) and thus ‘non-standard’. But I have trouble seeing which of these would be considered ‘non-standard’ (in fact, I doubt that in many cases fluent English users would even notice this usage unless it were pointed out or they were looking for it). And it’s not really a recent thing, is it?:

Against the Capitol I met a lion,
Who glared upon me, and went surly by

(Julius Caesar, 1.3.20-21)

This corpus-based analysis (Gilquin & Jacobs, 2006) of who being used with animals may also be of interest.

Another SkELL Update

SkELL has another new feature.

In the WordSketch function, there is now a button that provides more context for the words co-occurring with the target word/lemma.

Automatic PoS tagging in the corpus sometimes results in errors, and this feature is meant to help with this problem. It doesn’t really prevent the errors, but it should help users make correct identifications despite tagging errors.

For more info, see this post from the CorpusCALL FB group and play around with this example page from SkELL.

SkELL Update

I use SkELL quite often, so I was glad to read that there has been an update to the example sentences that get shown. This was an occasional, irritating issue because the kinds of SkELL-assisted activities I usually do with my students are hampered by spelling mistakes and such. For them, the learners, the cleaner data should be beneficial.

The update was announced by James Thomas on the CorpusCALL FB group; the text is reproduced here:

“SKELL If you are a user of SKELL, you might have noticed a recent improvement in the quality of the example sentences. This is thanks to the deletion of sentences that contained spelling mistakes and hapax legomena. While both of these things can be of interest, it is better that the 40 example sentences of a word or phrase are as accurate as possible.

There are 10,370 instances of the word ‘dolphin’, for example, in the full corpus. The algorithm that chooses the best 40 for learners now works with cleaner data.

It’s a nice improvement. Thanks, Vit.”

**Link to SkELL**

**Earlier posts on using SkELL**