Sounds odd (unintentionally literal)

Yesterday I wrote that the expression “achieve a breakthrough” sounded odd to me, and I much prefer the more frequent “make a breakthrough”.

“Achieve a breakthrough” is, however, a robust collocation (after MAKE, ACHIEVE is the most frequent lemma with “a breakthrough” in COCA; and it has a high MI score). So why did it sound so odd to me? Have I somehow just not encountered it much? Well, maybe sorta.

In the COCA data, any form of ACHIEVE occurs immediately to the left of “a breakthrough” 24 times, while any form of MAKE in that position occurs 65 times, a ratio of about 1:3. COCA provides register/context information in its KWIC displays, and this info showed me that of ACHIEVE’s 24 occurrences, 2 were in spoken contexts (a little over 8%), while 18 of 65 MAKE hits were spoken (a little under 28%). So, the ratio in spoken contexts is 1:9. Alternatively, in non-spoken contexts the ratio is about 1:2.
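For anyone who wants to check my arithmetic, here is a minimal Python sketch that recomputes the shares and ratios from the four counts reported above. The variable and function names are my own for illustration; nothing here queries COCA.

```python
# Counts reported above, read off COCA's KWIC displays.
counts = {
    "ACHIEVE": {"total": 24, "spoken": 2},
    "MAKE": {"total": 65, "spoken": 18},
}

def share_spoken(verb):
    """Proportion of a verb's hits that come from spoken contexts."""
    c = counts[verb]
    return c["spoken"] / c["total"]

def make_per_achieve(key):
    """How many MAKE hits there are for each ACHIEVE hit."""
    return counts["MAKE"][key] / counts["ACHIEVE"][key]

# Non-spoken hits are simply the total minus the spoken hits.
non_spoken = {verb: c["total"] - c["spoken"] for verb, c in counts.items()}

print(f"ACHIEVE spoken share: {share_spoken('ACHIEVE'):.1%}")
print(f"MAKE spoken share: {share_spoken('MAKE'):.1%}")
print(f"Overall ratio 1:{make_per_achieve('total'):.1f}")
print(f"Spoken ratio 1:{make_per_achieve('spoken'):.1f}")
print(f"Non-spoken ratio 1:{non_spoken['MAKE'] / non_spoken['ACHIEVE']:.1f}")
```

With only 24 ACHIEVE hits (and just 2 of them spoken), these ratios are suggestive rather than conclusive.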

I was in a speaking/listening skills oriented class, and we were talking about “achieve a breakthrough”. Perhaps the expression sounded odd to me because it’s not an expression I hear much, even if it would be unremarkable to me in a written context. That is, perhaps it is primarily an expression of written English, but not spoken, and thus it struck me as out of place or something.

I don’t actually think COCA is the best source of data about spoken English, but this sort of thing could be explored in other, maybe more suitable corpora.

Title note: when I said it sounded odd to me, I meant in a broad sense it seemed odd to me. Now I realize I may have unintentionally described how it really, literally, sounded odd to me.

 

 

________ a breakthrough

What is the first word that comes to mind to fill the blank in this post’s title?

If you’re like me, then it’s “make”. Obviously it doesn’t have to be “make”, but that’s what came to my mind.

We are at the beginning of our Autumn term here, and in chapter 1 of the coursebook there were a few vocabulary/collocation activities. Among them were these patterns:

  • _________ progress / a sacrifice / a profit
  • _________ your ambition / success / a breakthrough

The students had a bank of words to choose from to fill in the blanks, and this bank included make and achieve. Now, looking at the sets of words, it’s clear that make goes with progress / a sacrifice / a profit, while achieve goes with your ambition / success / a breakthrough.

However, “achieve a breakthrough” sounded a little odd to my ears. Not wrong. Odd. One of the pitfalls of constructing a collocation exercise specifically for coursebook purposes is that it can ignore other, more ‘natural’ collocations; in this case it actually ignored a verb that was in the exercise’s own word bank: “make”! A look at the top hits for “VERB a breakthrough” in COCA:

Screenshot: Top COCA hits for “VERB a breakthrough”

So, “made” and “make” are the top hits (by far), and then “achieve” in the 3rd spot. Now, of course “achieve a breakthrough” is fine and acceptable, and clearly attested in the data, but why would one teach that collocation without teaching “make a breakthrough” first (or as well)?

Anyway, I was going to wait a couple weeks before I introduced some corpus activities in class, but I decided this was a fine time to introduce and demonstrate a very basic use of a corpus. The two sections using this particular coursebook are relatively high level, and I was pleased to notice some students quite quickly grasping (through their asking of questions about what they were seeing) that corpora offer information about the target language that they wouldn’t get in a dictionary/grammar, and certainly not in their coursebook.

Against the Capitol I met a lion, Who glared upon me, and went surly by

It is, I believe, generally accepted that who can be used to refer to animals (or at least some animals, or at least under certain conditions). Thus, Merriam-Webster’s online dictionary has this entry (I’ve outlined the key part in red):

Screenshot: Merriam-Webster online dictionary entry for who

However, Merriam-Webster’s online learner’s dictionary has this entry:

Screenshot: Merriam-Webster online learner’s dictionary entry for who

No mention that who can be used for animals. And it isn’t only MW. I checked six learner’s dictionaries and none of them said this was acceptable, or an option. This isn’t a huge deal, necessarily, but it can lead to confusion if, say, this has been taught as a ‘rule’ and then students read graded readers where the ‘rule’ is broken without any consequence (search for horse who, fish who, or monkey who in the Lextutor graded reader corpus, for example). Of course, there are tons of sources where students might encounter who being used for animals.

This came up because a student of mine had written “I don’t want a dog who is so big” and a peer suggested it should be which. And that’s FINE. It can be which :-). Or that :-). But it can also be who :-D.

For teachers who like to do consciousness-raising or language awareness activities, this kind of situation provides opportunities for discussing what to do when language you read or hear in real life doesn’t seem to match what the dictionary/grammar guide says, for analyzing lines of ‘controversial’ lexicogrammar patterns, or for formulating ideas about why people choose to use who, or which, or maybe that in different circumstances (does it seem ok for certain animals but not others? is it special for pets? does it change the meaning/tone/nuance?).

Of course, the underlying ideas could be applied to other language questions, too. So, in general, the ‘corpus lesson’ here is that corpora can be used to explore alternatives to more conventional patterns and aid in developing greater language awareness. Corpus-use can be applied to not just learning frequent or common patterns of expression, but to expanding the ways in which learners are able to express themselves.


While talking about this with another teacher, it was suggested that maybe the learner’s dictionaries (and perhaps some other learner-oriented materials) don’t acknowledge who for animals as acceptable because it’s new (recent) and thus ‘non-standard’. But I have trouble seeing which of these would be considered ‘non-standard’ (in fact, I doubt that in many cases fluent English users would even notice this usage unless it were pointed out or they were looking for it). And it’s not really a recent thing, is it?:

Against the Capitol I met a lion,

Who glared upon me, and went surly by

Julius Caesar, 1.3 20-21


This corpus-based analysis (Gilquin & Jacobs, 2006) of who being used with animals may also be of interest.

“I agree —.” Referencing with COCA

In the past couple weeks I have had reason to pull up frequencies or concordances in response to student questions on several occasions. For a teacher this is a pretty basic but useful application of corpora.

For instance, this past week in a business-focused ESP course we were reading through a role play which used the expression ‘I agree wholeheartedly’. A few types, or maybe lines, of questions arose. Is ‘wholeheartedly’ very common? Can I say ‘I wholeheartedly agree’? Can I say ‘I agree completely’? Can I say ‘I agree perfectly’? What if I agree but not that strongly? These didn’t arise all at once, but came about through discussion (some after I pulled up results in COCA).

Notably, to me, meaning was not an issue at all. All of the questions were regarding usage.

I ran a quick and relatively simple search on COCA using: I agree [r*]

([r*] is the PoS code for adverb)

Here are the first 20 results:

Screenshot: First 20 COCA results for I agree [r*]

Across the entire corpus we can see that I agree wholeheartedly is the second most common formulation of the I+agree+adverb pattern. I agree completely is by far the most common. The top 4, 6 of the top 10, and 8 of the top 20 seem to be synonymous expressions of the sense of ‘100% agreement’.

We did not have time to go through each expression but the students said they appreciated seeing several examples of alternative phrasings. For the expression I agree perfectly, I actually thought it sounded a little strange when my student asked if it was ok, but here it was, attested. Only 2 hits, but after opening the concordance lines I saw how it was used and my hesitation about it evaporated. For me, this was a good reminder of how corpus reference can help me (and by extension my students) avoid over-generalizing from my own intuition.

The 9th most frequent hit, I agree in, is also notable, and my students asked specifically about it. So we took a look at its 6 concordance lines. 4 of the lines revealed the expression I agree in part and 1 revealed I agree in general; so they found some useful ways of expressing various degrees of agreement to go along with hits such as (13) I agree basically, (18) I agree strongly, or (19) I agree somewhat.

We also briefly looked at the query: I [r*] agree

This resulted in a very different frequency list:

Screenshot: COCA frequency list for I [r*] agree

There are more overall hits for this pattern. I totally agree is the top hit, with I completely agree the second. Interestingly, I certainly agree is the 4th most frequent hit despite I agree certainly occurring zero times in the corpus. In fact, the I+adverb+agree pattern revealed several expressions that had no lexical correlates in COCA to the I+agree+adverb pattern. This wasn’t the point of the referencing in class, so we didn’t dwell on it and focused on the variety of expressions that can be used, but, for me, it was definitely food for thought. Agree?
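To make the comparison of the two patterns concrete, here is a rough Python sketch of the idea. Everything in it is an assumption for illustration: the sentences are invented (they are not COCA data), and the small ADVERBS set is a hand-made stand-in for COCA's [r*] tag, since plain regular expressions know nothing about parts of speech.

```python
import re
from collections import Counter

# Hypothetical mini adverb list standing in for COCA's [r*] PoS tag.
ADVERBS = {"completely", "wholeheartedly", "totally",
           "certainly", "somewhat", "strongly"}

# Invented sample sentences for illustration; NOT COCA data.
SAMPLE = [
    "I agree completely with that.",
    "I totally agree.",
    "Well, I certainly agree that it is odd.",
    "I agree wholeheartedly.",
    "I completely agree with you.",
    "I agree somewhat, but not entirely.",
]

POST = re.compile(r"\bI agree (\w+)", re.IGNORECASE)   # I agree ADVERB
PRE = re.compile(r"\bI (\w+) agree\b", re.IGNORECASE)  # I ADVERB agree

def count_adverbs(pattern, sentences):
    """Count the adverbs that fill the open slot of a pattern."""
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            word = match.group(1).lower()
            if word in ADVERBS:  # crude substitute for PoS tagging
                counts[word] += 1
    return counts

post = count_adverbs(POST, SAMPLE)
pre = count_adverbs(PRE, SAMPLE)

# Adverbs attested before "agree" but never after it in this sample.
only_pre = sorted(set(pre) - set(post))
print(only_pre)
```

On a real corpus the interesting part is that last set difference: adverbs that appear in one slot but not the other.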

Playing by ear and verbing by body parts: Using COCA to discover usage

Last week, a passage in the textbook for one of my classes had the expressions ‘he tunes it by ear’ and ‘they learn to play by ear’. For many of the students (not all, tho) this was a new expression, ‘[verb] by ear’, but in the context of the passage most of them figured out, on their own, that it meant the people tuning or playing instruments were able to do so without any sheet music or other aids. The textbook had a question using the expression ‘play by ear’ in the chapter review, so I felt it would pay to explore the pattern a little.

So we had this pattern, ‘[verb] by ear’. What does ‘by ear’ really mean? And what verbs could fit into that slot? Several students agreed that it meant using one’s ears to do something. That was a fine start. But would it make any sense to say ‘sleep by ear’ or ‘imagine by ear’ with senses of using one’s ears to sleep or using one’s ears to imagine? No, not really. I asked my students if they could replace ‘by ear’ with a different expression in the phrase ‘they learn to play by ear’. Pretty quickly, they came up with ‘they learn to play by listening’. That’s more like it. So ‘[verb] by ear’ means that someone is using listening skills to [verb].

At this point, I directed the students to open up the newly-mobile-friendly COCA (every one of my students has an iPad) and enter the following query under the List function: VERB by ear.

Screenshot: Search area highlighted in red

Screenshot: VERB by ear: top 20 frequency list

Looking at the frequency data, it was immediately clear to them that ‘play(s)/playing/played by ear’ is, by far, the most frequent formulation of this pattern. ‘Learn(s)/learning/learned by ear’ is also quite a bit higher in frequency than other formulations of the pattern, most of which have only 1 or 2 occurrences in the corpus (note: ‘birding by ear’ is the title of a book for birdwatchers about learning to recognize birdsongs, and references to this book in the corpus make this formulation appear higher in the frequency list than one might otherwise expect).

Using this data my students and I discussed how the pattern ‘[verb] by ear’ can be used with a lot of different verbs, but in general they are safe knowing that it’s most often used with forms of play and learn.
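Grouping forms such as play(s)/playing/played under one verb is something COCA's frequency list leaves to the reader. A minimal Python sketch of that grouping step might look like the following. The sentences are invented (not COCA data), the LEMMAS table is a tiny hand-made assumption, and verbs_before is a hypothetical helper name of my own.

```python
import re
from collections import Counter

# Tiny hand-made lemma table (an assumption, far from complete).
LEMMAS = {
    "plays": "play", "playing": "play", "played": "play",
    "learns": "learn", "learning": "learn", "learned": "learn",
    "tunes": "tune", "tuning": "tune", "tuned": "tune",
}
VERBS = set(LEMMAS.values())

# Invented sentences for illustration; NOT COCA data.
SAMPLE = [
    "She plays by ear.",
    "He tunes it by ear.",
    "They learn to play by ear.",
    "I learned by ear as a child.",
    "He played by ear for years.",
]

def verbs_before(phrase, sentences):
    """Count known verbs directly before a phrase, grouped by lemma."""
    pattern = re.compile(r"\b(\w+) " + re.escape(phrase) + r"\b",
                         re.IGNORECASE)
    counts = Counter()
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            word = match.group(1).lower()
            lemma = LEMMAS.get(word, word)
            if lemma in VERBS:  # skips non-verbs such as "it"
                counts[lemma] += 1
    return counts

by_ear = verbs_before("by ear", SAMPLE)
print(by_ear.most_common())
```

Like the COCA query, this only looks at the word immediately to the left of the phrase, so ‘tunes it by ear’ is not counted under tune. Because the phrase is a parameter, the same helper could be pointed at other ‘by [noun]’ expressions.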

For homework, students were assigned one phrase from a set of phrases: ‘[verb] by mouth’ or ‘[verb] by hand’ or ‘[verb] by foot’ or ‘[verb] by eye’; and they were told to do a similar analysis to what we did in class. That is, explain how the expression is used and find out what verbs commonly fit into the empty slot. Also, they were to find 3 example sentences (using concordance lines from COCA, an online learner’s dictionary, SkELL, etc.).

In our next class students who were assigned the same phrase sat together and checked/discussed their results, and then they taught members of other groups about their assigned phrase.


Overall, this activity went over very well. A big part of that is due to the BYU Corpora interface redesign. It’s so much cleaner visually, easier to navigate, much easier to search, less intimidating, and so much more mobile-friendly.

Importantly, considering other class and content needs, this did not take up a lot of (in-class) time. The activity on the first day took around 15 mins (in a 90 min class), and the activity on the second day took around 25 mins. Within the activity there were elements of individual, group, and whole class work, which helped keep it from seeming tedious.

Of course, there were other options for how to exploit the corpus, too. For example, instead of assigning expressions for the students to research, we could have had some fun by having them find expressions of the ‘[verb] by [body part]’ variety themselves, maybe discovering some surprising combinations. I suspect you can think of a few ways to positively tweak this kind of activity, too.

 

Frequency lists in pre-reading activities

This Saturday (6/4/16) I am showing a poster at the JALTCALL Conference in Tokyo. The poster describes a process for using AntConc, or any software that can generate frequency lists, to create wordlists that students can use during pre-reading activities. The basic idea is that students use these lists to mark the words they don’t know or don’t feel confident about, and by doing so they create differentiated/personalized study lists. As a pre-reading activity, this can help students reach (or confirm) the word knowledge percentage threshold needed for comprehensible reading of a particular text. The underlying idea is actually quite flexible and doesn’t need to follow the exact steps outlined in the poster; rather, the process can be tailored to different contexts and settings. For example, I don’t always use the process to determine exact percentages of word knowledge for each student. Instead, I might let students work in a group and research any unknown words together just before reading (especially if it’s a relatively short reading and I just want to ensure they will recognize most, if not all, of the words in the text).

If anyone is interested, below are a .pdf of the poster and a .docx of the stoplist I used in the example in the poster (I made the stoplist based mostly on frequency data found in COCA). If you want to use the stoplist with a program such as AntConc, you should save it as a .txt file first.

Poster

Stoplist

For some background on word knowledge percentage thresholds for reading, check out The Extensive Reading Foundation’s Guide to Extensive Reading.
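For readers who would rather script the list-making than use AntConc's interface, here is a minimal Python sketch of the two steps described above: building a frequency list with a stoplist applied, and estimating a word knowledge percentage once a student has marked unknown words. This is not the exact process from the poster. The function names, the toy text, and the choice to count coverage over all running words (including stoplist words) are my own assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(text, stoplist=frozenset()):
    """Word frequencies for a text, skipping words in the stoplist."""
    return Counter(t for t in tokenize(text) if t not in stoplist)

def coverage(text, unknown_words):
    """Share of running words NOT marked as unknown by the student."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t not in unknown_words)
    return known / len(tokens)

# Toy text and lists for illustration only.
text = "The cat sat on the mat. The cat purred."
stoplist = {"the", "on"}   # e.g. loaded from a .txt stoplist
unknown = {"purred"}       # words a student marked on the list

freq = frequency_list(text, stoplist)
for word, count in freq.most_common():
    print(word, count)

pct = coverage(text, unknown)
print(f"Known-word coverage: {pct:.0%}")
```

The coverage figure can then be compared against whatever threshold you are working with for a given text.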

 

 

 

SkELL: Homonymy and Polysemy

One drawback when using SkELL is that it won’t differentiate between, say, lead/lead or the various senses of ‘rat’. The word sketch function will differentiate between parts of speech, but the easy-to-read concordance lines initially generated will have the various words, meanings, and senses jumbled up. However, this drawback can be exploited for the teaching of various kinds of homonyms and polysemous words.

There are several ways to do this, but I’ll only discuss one basic approach here. Take the word ‘sweet’. Maybe you have students familiar with the taste sense of the word, as in “The berries are rather sweet and juicy”. You could show them (or have them look up) the SkELL concordance lines for ‘sweet’. Have them mark off the sentences that they recognize as referring to sweet taste. This would leave several sentences that use ‘sweet’ in different senses, and your students could discuss what ‘sweet’ might mean in those other sentences.

Screenshot: Partial SkELL output for ‘sweet’

In the screenshot above, for instance, lines 9, 10, 19, and 20 appear to be describing something about people’s personalities. Discuss with your students what it could mean to describe a person as ‘sweet’.

Alternatively, students could use a dictionary to look up all/several of the senses of ‘sweet’, and then try to categorize the SkELL sentences according to each sense.

Regardless of how exactly you approach it, there are a lot of ways to exploit this drawback for teaching and learning purposes.

Any other SkELL tips?